(54) Data processing apparatus and method

(57) In decoding code data encoded in object units,
decoders corresponding to the number of objects are
needed. However, it is impossible to always provide a
sufficient number of decoders. Accordingly, when code
data 8 is decoded, an object combiner 43 refers to the
number s of objects included in the code data 8,
detected by an object counter 41, and the number d of
object decoders, detected by an object decoder counter
42. If s>d holds, the object combiner 43 regulates the
number of the objects of the input code data 8 to d.
Description

BACKGROUND OF THE INVENTION 

FIELD OF THE INVENTION

[0001] The present invention relates to decoding apparatus and method and, more particularly, to data processing
apparatus and method which decode code encoded in image object units.

[0002] Further, the present invention relates to data processing apparatus and method which process a data array
constructing an image with a plurality of coded image objects.

DESCRIPTION OF RELATED ART 

[0003] In recent years, with advances in image encoding techniques and progress in computer capabilities, an
encoding method that separates an image into objects and encodes each object has been proposed. Image encoding
in object units enables optimum encoding of each object, thus improving the coding efficiency. At the same time, a
function to generate a new image by editing the objects within the image can be obtained. For example, in still-image
technology, a method that separates an image into "character", "line", "frame", "image", "table" and "background"
areas and performs optimum encoding on the respective areas, such as the ACBIS method (by Maeda and Yoshida in
"The 1996 Institute of Electronics, Information and Communication Engineers General Conference D-292"), has been
proposed. According to this method, JBIG (Joint Bi-level Image Group) encoding, a binary-image encoding method,
is performed on the "character", "line", "frame" and "table" areas, and in the "background" area, its representative value
is encoded.

[0004] Further, for moving images, a method of encoding in object units has been studied as an international
standard, MPEG4 (Moving Picture Experts Group phase 4) (Eto, "MPEG4 Standardization", The Journal of The
Institute of Image Electronics Engineers of Japan, vol. 25, No. 3, 1996, pp. 223-228). Fig. 1 shows an example of a
frame of a moving image to be encoded by the MPEG4 coding. In Fig. 1, a frame 20 comprises four objects as shown
in Fig. 2, i.e., a background object 28, an object 21 representing a helicopter, an object 22 representing a train, and an
object 23 representing a car. To indicate the shapes of the objects except the background, each object is masked such
that a black part of a rectangular area surrounding the object is an "outer area", and a white part is an "inner area" (24
to 26 in Fig. 2), and by this masking, an arbitrarily shaped object can be handled.

[0005] Fig. 3 shows a construction for coding in object units. An input image 1 is inputted into an object segmenter 2,
and is separated into respective objects. For example, the image in Fig. 1 is separated by the object segmenter 2 into
the objects 28, 21, 22 and 23, and the objects are independently encoded. That is, an object encoder 3 encodes the
object 28; an object encoder 4, the object 21; an object encoder 5, the object 22; and an object encoder 6, the object
23. A multiplexer 7 multiplexes code data outputted from the object encoders 3 to 6, and outputs the multiplexed data
as code data 8.

[0006] Fig. 4 shows a construction for decoding an image encoded in object units. The code data 8 is inputted into a
demultiplexer 9, and separated into code data corresponding to the respective objects. The separated code data are
independently decoded. That is, an object decoder 10 decodes the object 28; an object decoder 11, the object 21; an
object decoder 12, the object 22; and an object decoder 13, the object 23. An object compositer 14 arranges image data
outputted from the object decoders 10 to 13 in proper positions for the respective objects, thus composes them as one
image, and outputs the image data as a reproduced image 15.
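
The compositing step described above can be sketched as follows. This is only an illustrative sketch, not the construction of the object compositer 14 itself: the function name `compose`, the dictionary keys `loc`, `pixels` and `mask`, and the use of nested lists as images are all assumptions made for the example. A mask value of 1 stands for the "inner area" and 0 for the "outer area".

```python
def compose(background, objects):
    """Paste each decoded object onto a copy of the background.

    background: 2-D list of pixel values.
    objects: list of dicts with hypothetical keys
        'loc'    -> (x, y) of the top-left corner of the object rectangle,
        'pixels' -> 2-D list of decoded pixel values,
        'mask'   -> 2-D list, 1 = inner area, 0 = outer area.
    """
    frame = [row[:] for row in background]  # leave the input untouched
    for obj in objects:
        x0, y0 = obj["loc"]
        for dy, (pix_row, mask_row) in enumerate(zip(obj["pixels"], obj["mask"])):
            for dx, (pix, inside) in enumerate(zip(pix_row, mask_row)):
                if inside:  # only the inner area overwrites the background
                    frame[y0 + dy][x0 + dx] = pix
    return frame


background = [[0] * 4 for _ in range(3)]
car = {"loc": (1, 1), "pixels": [[7, 7], [7, 7]], "mask": [[1, 0], [1, 1]]}
reproduced = compose(background, [car])
```

Objects later in the list overwrite earlier ones where their inner areas overlap, which is one possible layering rule; the source does not specify one.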

[0007] In moving image coding represented by the MPEG2 (Moving Picture Experts Group phase 2) standard, coding
is performed in frame or field units. To realize reuse or editing of contents (person, building, voice, sound, background
and the like) constructing a video image and audio data of a moving image, the MPEG4 standard is characterized by
handling video data and audio data as objects. Further, objects included in a video image are independently encoded,
and the objects are independently handled.

[0008] Fig. 25 shows an example of the structure of object code data. The moving image code data based on the
MPEG4 standard has a hierarchical structure, from the point of improvement in coding efficiency and editing operability.
As shown in Fig. 25, the head of code data has a visual_object_sequence_start_code (VOSSC in Fig. 25) for
identification. Then, code data of respective visual objects follows, and visual_object_sequence_end_code (VOSEC in
Fig. 25) indicative of the rear end of the code data is positioned at the end. As well as obtained moving images, computer
graphics (CG) data and the like are defined as visual objects.
[0009] The visual object data has visual_object_start_code (Visual Object SC in Fig. 25) for identification at its
header, then profile_and_level_indication (PLI in Fig. 25) indicative of an encoding level. Then, information on visual
objects, is_visual_object_identifier (IVOI in Fig. 25), visual_object_verid (VOVID in Fig. 25), visual_object_priority
(VOPRI in Fig. 25), visual_object_type (VOTYPE in Fig. 25) and the like follow. These data construct header information
of the visual object. "VOTYPE" has a value "0001" if the image is a moving image obtained by image pickup. Then,
video object (VO) data as a cluster of moving image code data follows.

[0010] The VO data is code data indicative of each object. The VO data has video_object_start_code (VOSC in Fig.
25) for identification at its header. Further, the VO data has video object layer data (VOL data in Fig. 25) to realize
scalability. The VOL data has video_object_layer_start_code (VOLSC in Fig. 25) and video object plane data (VOP data in
Fig. 25) corresponding to one frame of moving image. The VOL data has video_object_layer_width (VOL_width in Fig.
25) and video_object_layer_height (VOL_height in Fig. 25) indicative of size, at its header. Also, the VOP data has
video_object_plane_width (VOP_width in Fig. 25) and video_object_plane_height (VOP_height in Fig. 25) indicative of
size, at its header. Further, the header of the VOL data has bit_rate code indicative of bit rate.
[0011] Note that in each layer of the code data structure, data of an arbitrary length which starts with
user_data_start_code can be inserted by a user. The user data is distinguished from the code data by recognition of
start code VOSC and VOLSC or VOPSC following the user data.

[0012] Further, arrangement information, which is information to arrange the respective objects on the decoding side,
is called a system code. In the system code, similar to VRML (Virtual Reality Markup Language) as a CG language,
information describing arrangement of divided objects, reproduction timing or the like is encoded. The system code
describes the relation among the respective objects using the concept of nodes. Hereinbelow, the nodes will be
specifically described with reference to Figs. 26 and 27.

[0013] Fig. 26 is an example of an image constructed with a plurality of objects. This image comprises a Background
object 2000, a Balloon object 2001, a Bird object 2002, a Jet object 2003, a Car object 2004, a Woman object 2005 and
a Man object 2006, respectively representing background, a balloon, a bird, an airplane, a car, a woman and a man.
[0014] Fig. 27 shows a node tree in the image in Fig. 26. The entire image is represented by a Scene node. The
Scene node is connected to the Background object 2000, the Car object 2004, a People node indicative of people,
and a Fly node indicative of things flying in the sky. Further, the People node is connected to the Woman object 2005
and the Man object 2006. The Fly node is connected to the Balloon object 2001, the Bird object 2002 and the Jet object
2003. The relation among the objects is described in the data of the system code.
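
For illustration, the node tree of Fig. 27 can be written down as a nested structure. The sketch below is an assumption about representation only (nested Python dictionaries, with a hypothetical helper `leaf_objects`); it does not reflect how the system code is actually encoded.

```python
# Node tree of Fig. 27 as nested dicts; leaves hold the object reference numerals.
scene = {
    "Background": 2000,
    "Car": 2004,
    "People": {"Woman": 2005, "Man": 2006},
    "Fly": {"Balloon": 2001, "Bird": 2002, "Jet": 2003},
}


def leaf_objects(node):
    """Return the object numerals reachable from a node, in tree order."""
    if isinstance(node, int):
        return [node]
    found = []
    for child in node.values():
        found.extend(leaf_objects(child))
    return found


all_objects = leaf_objects(scene)
flying = leaf_objects(scene["Fly"])
```

Grouping objects under intermediate nodes such as People and Fly lets a whole group be addressed through a single node.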

[0015] In this manner, according to the MPEG4 standard, by handling objects in a moving image independently, the
decoding side can freely arrange various objects. Further, in broadcasting companies, contents producing companies
and the like, by generating code data of objects beforehand, a very large number of moving image data can be
generated from limited contents.

[0016] However, the above-described techniques have the following problems. To decode respective objects
independently, decoders corresponding to the number of separated objects are required. However, on the decoding side, it
is impossible to prepare an arbitrary number of decoders. Accordingly, the number of independently encoded objects
may be larger than the number of prepared decoders. The decoding apparatus as shown in Fig. 5 has three object
decoders. A demultiplexer 9 allocates the object decoders to the code data 8 in input order. If the code data 8 includes
four objects, the demultiplexer 9 allocates the object 28 to the object decoder 10, the object 21 to the object decoder
11, and the object 22 to the object decoder 12. However, regarding the object 23, as there is no available object
decoder, the object 23 is not decoded. Accordingly, in an image obtained by decoding the objects and synthesizing
them, the object 23 is omitted, as in a frame 38 in Fig. 6.
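
The allocation problem described above can be reproduced with a short sketch. The function name `allocate_in_input_order` is hypothetical; the reference numerals of the objects and object decoders are taken from the description of Fig. 5.

```python
def allocate_in_input_order(object_ids, decoder_ids):
    """Pair objects with decoders in input order.

    Returns (assigned, dropped): a dict object -> decoder, and the list of
    objects left without a decoder.
    """
    assigned = dict(zip(object_ids, decoder_ids))  # zip stops at the shorter list
    dropped = list(object_ids[len(decoder_ids):])
    return assigned, dropped


# Four objects (28, 21, 22, 23) but only three object decoders (10, 11, 12).
assigned, dropped = allocate_in_input_order([28, 21, 22, 23], [10, 11, 12])
```

With these inputs the object 23 ends up in `dropped`, which corresponds to the omitted object in the frame 38 of Fig. 6.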

[0017] That is, in the coding based on the MPEG4 standard, as an unspecified number of objects are handled, the
number of decoding means to decode all the objects cannot be determined, especially on the decoding side.
Accordingly, it is very difficult to construct an apparatus or system. For this reason, in the standardized MPEG4 coding, to
determine the specifications upon designing of code data and encoder/decoder, the concepts of profile and level are
defined, and the number of objects and the upper limit value of bit rate are provided as coding specifications. Fig. 28
shows an example of a profile table defining the number of objects and the bit rate upper limits of profiles and levels.
[0018] In the MPEG4 standard, a coding tool differs in accordance with profile. Further, as shown in Fig. 28, the
amount of code data of a handled image is determined stepwise in accordance with level. Note that the maximum
number of objects to be handled and the maximum bit rate value are upper limits in the coding specifications, and all
the values are included in the coding specifications as long as they are less than the above maximum values. For
example, in a case where a coding tool is available in a Core profile, the number of objects is six, and coding is performed at
a bit rate of 300 kbps, the code data and the coding tool correspond to level 2 (Core profile and level 2).

[0019] The above-described profile and level are indicated in the PLI in a bit stream of MPEG4 code data as shown
in Fig. 25. Accordingly, a decoder which decodes a bit stream of MPEG4 code data can determine whether or not
decoding is possible by referring to the PLI. The decoding is impossible in the following case.

[0020] For example, a decoder corresponding to Core profile and level 1 cannot decode code data of Core profile
and level 2, since the maximum bit rate of Core profile and level 2 is 2000 kbps, far higher than 384 kbps as the
maximum bit rate of Core profile and level 1.

[0021] Further, in an image including four objects, by synthesizing two code data of Simple profile and level 1, two
code data of Simple profile and level 2 can be generated. However, as the maximum number of objects of Simple profile



EP0 954 181 A2 



and level 2 is 4, code data which cannot belong to any profile or level of the MPEG4 standard is generated. Accordingly, 
such coded data cannot be decoded. 

[0022] Further, for example, if a new bit stream is generated by multiplexing two code data of Simple profile, with bit
rates 48 kbps and 8 kbps, of two images respectively including two objects, the bit rate of the new bit stream may be
over 64 kbps. In this case, the level of the code data must be raised to level 2, and it cannot be decoded by a decoder
of Simple profile and level 1.

[0023] That is, if the coding specifications (profile and level) of a decoder do not sufficiently cover the coding specifi- 
cations (profile and level) of code data, the decoder cannot decode the code data. 
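
The coverage condition stated above can be sketched as a simple comparison. This is a hedged illustration only: the class `CodingSpec` and the function `can_decode` are hypothetical names, requiring an identical profile is a simplification, and the object limit of 4 for Core profile and level 1 is an assumed value chosen to agree with the six-object example in paragraph [0018]. The 384 kbps limit is the one given in paragraph [0020]; Fig. 28 remains the authoritative table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CodingSpec:
    profile: str
    max_objects: int  # upper limit on the number of objects
    max_kbps: int     # upper limit on the bit rate


def can_decode(decoder, stream_profile, stream_objects, stream_kbps):
    """True if the decoder's coding specifications cover the code data."""
    return (
        decoder.profile == stream_profile
        and stream_objects <= decoder.max_objects
        and stream_kbps <= decoder.max_kbps
    )


# Core profile, level 1: 384 kbps from the text; 4 objects is an assumption.
core_l1 = CodingSpec(profile="Core", max_objects=4, max_kbps=384)

ok = can_decode(core_l1, "Core", 4, 300)
too_fast = can_decode(core_l1, "Core", 4, 2000)   # level 2 bit rate
too_many = can_decode(core_l1, "Core", 6, 300)    # six objects
```

A decoder would evaluate such a condition after reading the PLI from the bit stream.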

[0024] This problem becomes especially pronounced upon synthesizing a plurality of images. For example, when a
plurality of code data, each decodable by a decoder, are synthesized, occasionally the decoder cannot decode
the synthesized code data. Further, if the synthesized code data does not correspond to any of the MPEG4 profiles and
levels, it cannot be decoded by a decoder based on the MPEG4 standard.

SUMMARY OF THE INVENTION 

[0025] The present invention has been made to solve the above-described problems, and has as a concern to provide
data processing apparatus and method which decode all the image objects even if the number of decoders is limited.

[0026] According to the present invention there is provided a data processing apparatus having decoding means for
decoding code encoded in image object units, said apparatus comprising: detection means for detecting the number of
objects included in input code and the number of objects decodable by said decoding means; and control means for
controlling the number of objects of the input code, based on the number of objects and the number of decodable
objects detected by said detection means.

[0027] Further, another concern of the present invention is to provide data processing apparatus and method which
decode coded still image and/or moving image without degrading the image quality even if the number of decoders is
limited.

[0028] According to a feature of the present invention, the above-described apparatus further comprises: extraction
means for extracting location information of the objects included in said code; and combining means for combining code
of a plurality of objects, based on an instruction from said control means and the location information extracted by said
extraction means.

[0029] Further, the above-described apparatus may comprise: extraction means for extracting motion information
indicative of motions of the objects included in said code; and combining means for combining a plurality of objects
based on an instruction from said control means and the motion information extracted by said extraction means.

[0030] Further, another concern of the present invention is to provide data processing apparatus and method which
decode code data, encoded by each of plural image objects, with decoders of arbitrary coding specifications.

[0031] Further, another concern of the present invention is to provide data processing apparatus and method which
control the number of objects included in code data.

[0032] According to an aspect of the present invention, there is provided a data processing apparatus for processing
a data array to reproduce an image with a plurality of coded image objects, said apparatus comprising: detection means
for detecting the number of image objects included in said data array; and control means for controlling the number of
image objects included in said data array based on the number of image objects detected by said detection means.

[0033] Further, another concern of the present invention is to provide data processing apparatus and method which
synthesize a plurality of code data, encoded by each of plural image objects, to obtain one code data based on a
predetermined coding standard.

[0034] According to another aspect of the present invention there is provided a data processing apparatus
comprising: input means for inputting a plurality of image data to construct one frame, wherein said image data respectively
include N image objects, where N≥1 holds; and generation means for generating image data having M image objects,
where holds, constructing said one frame, by integrating at least a part of said N image objects based on additional
information indicative of relation among the image objects.

[0035] Further, another concern of the present invention is to provide data processing apparatus and method which
decode synthesized code data with decoders of arbitrary coding specifications.

[0036] Further, another concern of the present invention is to provide data processing apparatus and method which
control the number of objects included in code data and/or the information amount of the code data.
[0037] According to another aspect of the present invention, there is provided a data processing apparatus for
processing a data array to reproduce one frame image with a plurality of coded image objects, said apparatus
comprising: input means for inputting a plurality of data arrays; instruction means for instructing synthesizing of a plurality of
data arrays inputted by said input means; designation means for designating coding specifications of a processed data
array; control means for controlling information amounts of the plurality of data arrays inputted by said input means,
based on the coding specifications designated by said designation means; and synthesizing means for synthesizing the
plurality of data arrays with information amounts controlled by said control means, based on the coding specifications
designated by said designation means.

[0038] Other features and advantages of the present invention will be apparent from the following description taken
in conjunction with the accompanying drawings, in which like reference characters designate the same or similar
parts throughout the figures thereof.

BRIEF DESCRIPTION OF THE DRAWINGS 

[0039] The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate
embodiments of the invention and, together with the description, serve to explain the principles of the invention.

Fig. 1 is an example of the image processed by the MPEG4 coding;
Fig. 2 is an explanatory view showing the objects of the image in Fig. 1;
Fig. 3 is a block diagram showing the construction for coding in object units;
Fig. 4 is a block diagram showing the construction for decoding an image encoded in object units;
Fig. 5 is a block diagram showing the construction for decoding an image encoded in object units;
Fig. 6 is an example of decoded image where an object is omitted;
Fig. 7 is a block diagram showing the construction of a decoding apparatus according to the present invention;
Fig. 8 is an example of 1-frame code data;
Fig. 9 is an example of synthesized code data;
Fig. 10 is a block diagram showing the construction of an object combiner according to a first embodiment of the
present invention;
Figs. 11A and 11B are examples of object combining;
Fig. 12 is an example of 1-frame code data to be motion-compensated;
Fig. 13 is an example of synthesized code data;

Figs. 14A to 14D are examples of objects of a still image and combined objects;
Fig. 15 is an example of 1 frame of a moving image;
Fig. 16 is a block diagram showing the construction of the object combiner according to a second embodiment of
the present invention;
Figs. 17A and 17B are examples of combined objects;
Fig. 18 is an example of code data including combined objects;
Fig. 19 is a block diagram showing the construction of the object combiner according to a third embodiment of the
present invention;
Fig. 20 is an example of input code data;
Fig. 21 is an example of processed code data;
Fig. 22 is a block diagram showing the construction of the object combiner according to a fourth embodiment of the
present invention;
Fig. 23 is a block diagram showing the construction of the object combiner according to a modification;
Fig. 24 is a block diagram showing the construction of the object combiner according to another modification;
Fig. 25 is an example of the structure of object code data;
Fig. 26 is an example of the image constructed with a plurality of objects;
Fig. 27 is an example of a node tree in the image in Fig. 26;
Fig. 28 is an example of the profile table defining the number of objects and the bit rate upper limits by profile and
level;
Fig. 29 is a block diagram showing the construction of a moving image processing apparatus according to a fifth
embodiment of the present invention;
Fig. 30 is a block diagram showing the construction of a profile and level regulator according to the fifth
embodiment;
Figs. 31A and 31B are examples of the structure of code data of moving image;
Fig. 32 is a block diagram showing the construction of the profile and level regulator according to a sixth
embodiment of the present invention;
Fig. 33 is a block diagram showing the construction of the profile and level regulator according to a seventh
embodiment of the present invention;
Fig. 34 is a block diagram showing the construction of an object integrator according to the seventh embodiment;
Fig. 35 is an example of the structure of synthesized code data according to the seventh embodiment;
Fig. 36 is a block diagram showing the construction of the object integrator according to a modification of the
seventh embodiment;
Fig. 37 is an example of synthesized color image information according to the seventh embodiment;
Fig. 38 is an example of synthesized mask information according to the seventh embodiment;
Fig. 39 is a block diagram showing the construction of the object integrator according to an eighth embodiment of
the present invention;
Fig. 40 is an example of a slice structure of color image information according to the eighth embodiment;
Fig. 41 is a block diagram showing the construction of the profile and level regulator according to a ninth
embodiment of the present invention;
Fig. 42 is an example of the structure of synthesized moving image code data according to the ninth embodiment;
Fig. 43 is an example of the construction of an image represented by code data;
Fig. 44 is an example of the construction of an image represented by code data;
Fig. 45 is a block diagram showing the construction of the moving image processing apparatus according to a tenth
embodiment of the present invention;
Figs. 46A to 46D are examples of images to be synthesized;
Fig. 47 is a block diagram showing the construction of an image editing unit according to the tenth embodiment;
Fig. 48 is an example of a synthesized image;
Fig. 49 is a block diagram showing the detailed construction of a header processor;
Figs. 50A to 50E are examples of code data of images to be synthesized and of a synthesized image;
Fig. 51 is a flowchart showing image processing according to the tenth embodiment;
Fig. 52 is a block diagram showing the construction of the image editing unit according to an eleventh embodiment
of the present invention;
Figs. 53A to 53D are examples of node trees showing the relation among respective objects;
Fig. 54 is a block diagram showing the construction of a code length regulator;
Figs. 55 and 56 are block diagrams showing the constructions of the code length regulator according to
modifications of the eleventh embodiment; and
Fig. 57 is an example of code data of a synthesized image.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS 

[0040] Preferred embodiments of the present invention will now be described in detail in accordance with the
accompanying drawings.

First Embodiment 

[Construction] 

[0041] Fig. 7 is a block diagram showing the construction of a decoding apparatus according to the present invention.
Note that elements approximately corresponding to those in Figs. 3 and 5 have the same reference numerals and
detailed explanations of the elements will be omitted.

[0042] In Fig. 7, an object number regulator 40 includes an object counter 41 which counts the number s of objects,
an object decoder counter 42 which counts the number d of object decoders, and an object combiner 43 which
combines the plurality of objects included in the code data 8. Numeral 50 denotes a storage device comprising a magnetic
disk or the like.

[0043] The code data 8 inputted into the decoding apparatus is subjected to coding by an encoder as shown in Fig.
3, for example. The code data 8 includes four objects at the maximum. Hereinbelow, description will be made using a
moving image frame as shown in Fig. 1 as an original image.
[0044] The code data 8 is inputted into the object number regulator 40 frame by frame. When code data of a frame has
been inputted, the object counter 41 counts the number s of objects included in the code data.

[0045] Fig. 8 is an example of 1-frame code data. The code data has "Header" indicative of the attribute of the frame
at its head, and next, code data indicative of the background object (Object 0 in Fig. 8). Then, code data of the respective
objects, i.e., code data of the object 21 (Object 1), the object 22 (Object 2) and the object 23 (Object 3) follow. The code
data of each object comprises Start code (SC) indicative of the header of the object, Location (Loc) code indicative of
location of the object, Size code indicative of the size of the object, Shape code indicative of the shape of the object,
and Texture code indicative of the object itself.

[0046] In the following description, the Shape code is binary-encoded by MR coding, and the Texture code is encoded
by block coding. Note that block coding means dividing an object into, e.g., 8x8 pixel blocks, then performing discrete
cosine transformation (DCT) on each block, and quantizing and encoding the obtained conversion coefficients (DCT
coefficients), as in JPEG coding. Fig. 8 shows the Texture code of the Object 1. The Texture code is a set of
block-based code, DCT-COEFs. The code DCT-COEF is obtained by one-dimensionally rearranging quantization values of
DCT coefficients and encoding the run-lengths of zeros and the quantization values other than zero. If all the
quantization values are zero, no DCT-COEF is generated.
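
The run-length step described above can be illustrated with a simplified sketch. It assumes the quantization values have already been rearranged into a one-dimensional list, and it stops at (zero run, value) pairs; the scan order and the entropy code applied to those pairs are not given in this passage and are not modelled. The function name `run_level_pairs` is hypothetical.

```python
def run_level_pairs(quantized):
    """Turn a 1-D list of quantized coefficients into (zero_run, value) pairs.

    Each pair records how many zeros precede a non-zero value. Trailing zeros
    produce no pair, so an all-zero block yields an empty list, matching the
    case where no DCT-COEF is generated.
    """
    pairs = []
    run = 0
    for value in quantized:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    return pairs


block = [5, 0, 0, -2, 1, 0, 0, 0]
pairs = run_level_pairs(block)
empty = run_level_pairs([0] * 64)
```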

[0047] The count value s of the object counter 41 is reset to zero upon start of input of 1-frame code data. Then, the
number of occurrences of SC, indicative of the header of an object in the code data, is counted. The result of counting is
inputted into the object combiner 43. The object decoder counter 42 counts the number d of object decoders. In the
present embodiment, as the number of object decoders is three, the output from the object decoder
counter 42 is "3".
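
The counting performed by the object counter 41 can be sketched as below. The frame is modelled, purely as an assumption for illustration, as a list of token names following the layout of Fig. 8; real code data is a bit stream in which SC would be located by pattern matching. The function name `count_objects` is hypothetical.

```python
def count_objects(frame_tokens, start_code="SC"):
    """Count the objects in one frame by counting start-code occurrences."""
    s = 0  # reset to zero at the start of each frame
    for token in frame_tokens:
        if token == start_code:
            s += 1
    return s


# One frame as in Fig. 8: a header followed by Object 0 to Object 3.
frame = ["Header"] + ["SC", "Loc", "Size", "Shape", "Texture"] * 4

s = count_objects(frame)
d = 3                      # number of object decoders in the embodiment
shortage = max(s - d, 0)   # number of decoders that are missing
```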

[0048] Fig. 10 is a block diagram showing the construction of the object combiner 43. A terminal 61 inputs the code
data 8. A terminal 62 inputs the number s of the objects from the object counter 41. A terminal 63 inputs the number d
of the object decoders from the object decoder counter 42.

[0049] A code memory 64 is used for storing code data for one or more frames inputted from the terminal 61. A
location information extractor 65 extracts the Loc code from the code data stored in the code memory 64, and stores the
extracted Loc code into a location information memory 66 in frame units. An object distance calculator 67 calculates
distances between respective objects, based on the Loc code stored in the location information memory 66.
[0050] A distance comparator 68 compares the distances calculated by the object distance calculator 67, and selects
objects to be combined, based on the result of comparison. For each object, a selector 69 outputs object code data,
read from the code memory 64, to a code divider 70 or a terminal 76, whichever is designated by the distance
comparator 68 as the output destination.

[0051] The code divider 70 divides object code data into Loc, Size, Shape and Texture code. A location code combiner
71 combines the Loc code of two objects into one Loc code. A size code combiner 72 combines the Size code of two
objects into one Size code. A shape code combiner 73 combines the Shape code of two objects into one Shape code.
A texture code combiner 74 combines the Texture code of two objects into one Texture code. A code synthesizer 75
synthesizes outputs from the location code combiner 71 to the texture code combiner 74 into one code data.

[0052] One of the output from the code synthesizer 75 and that from the selector 69 is forwarded to the next stage via
a terminal 76.

[Operation] 

[0053] Next, the operation of the present embodiment will be described for a case where frames are independently
encoded, as with an MPEG Intra frame or the respective Motion JPEG frames.

• Frame-Based Coding 

[0054] In Fig. 10, code data for one or more frames is stored via the terminal 61 into the code memory 64, and the
number s of objects is inputted via the terminal 62 from the object counter 41. In case of the code data in Fig. 8, the
number s of objects is four (s=4). Further, the number d of object decoders is inputted via the terminal 63 from the object
decoder counter 42. In the present embodiment, the number d of object decoders is three (d=3). Accordingly, s-d=1
holds, i.e., the apparatus is one decoder short.

[0055] The object distance calculator 67 obtains the distance between the object 21 and the object 22 from the Loc
code stored in the location information memory 66. If the location of the object 21 is (x1,y1) and that of the object 22 is
(x2,y2), the distance D12 between these objects is represented by the following equation:

$$D_{12} = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2} \tag{1}$$

[0056] Similarly, the distance D13 between the object 21 and the object 23 and the distance D23 between the object
22 and the object 23 are obtained. Based on the obtained distances between objects, the distance comparator 68
selects a plurality of objects with a short distance therebetween, to combine the objects to compensate for the shortage
of object decoders. For example, the distance comparator 68 selects a plurality of objects with the smallest sum of
distances therebetween over a plurality of frames. In the present embodiment, as the shortage of object decoders is "1", two
objects are combined into one object. If the sum of the distance D12 is the smallest, the object 21 and the object 22 are
combined. Thus, the shortage of object decoders is resolved.
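
The selection made by the distance comparator 68 can be sketched as follows. The sketch uses the Euclidean distance of equation (1) and sums it over several frames; the object locations are invented sample values, and the function name `closest_pair` is hypothetical. The background object is left out, as in the description, which only compares D12, D13 and D23.

```python
from itertools import combinations
from math import hypot


def closest_pair(frames):
    """Return the pair of object ids whose summed distance over all frames is smallest.

    frames: list of dicts mapping an object id to its (x, y) location in one frame.
    """
    def summed_distance(pair):
        a, b = pair
        return sum(hypot(f[a][0] - f[b][0], f[a][1] - f[b][1]) for f in frames)

    return min(combinations(sorted(frames[0]), 2), key=summed_distance)


# Sample locations for the objects 21, 22 and 23 in two frames (assumed values).
frames = [
    {21: (10, 10), 22: (40, 50), 23: (200, 120)},
    {21: (12, 10), 22: (40, 50), 23: (190, 120)},
]
pair = closest_pair(frames)

shortage = 1                     # s - d in the present embodiment
objects_to_merge = shortage + 1  # two objects are combined into one
```

Combining the nearest objects keeps the rectangle of the combined object small, which is presumably why distance is used as the criterion.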

[0057] If the object 21 and the object 22 are combined, the selector 69, controlled by the output from the distance
comparator 68, sends the header outputted from the code memory 64 to the terminal 76, and sends the Object 0 as
background object to the terminal 76.

[0058] Next, as the output from the distance comparator 68 for the Object 1 corresponding to the object 21 indicates
"selection", the selector 69 sends the Object 1 to the code divider 70. The Loc code of the Object 1 is sent
to the location code combiner 71, the Loc code and Size code are sent to the size code combiner 72, the Shape code
is sent to the shape code combiner 73, and the Texture code is sent to the texture code combiner 74. Next, as the output
from the distance comparator 68 for the Object 2 corresponding to the object 22 also indicates "selection",
the Object 2 code is divided into the respective code, and the divided code are inputted into the location code combiner
71 to the texture code combiner 74, as in the case of the Object 1.

[0059] Note that as the output from the distance comparator 68 for the Object 3 corresponding to the object 23 indicates
"non-selection", the code data of the object 23 is outputted to the terminal 76 without any processing. Further, for
a frame having the code data 8 where s-d≤0 holds, object combining is not performed, and the output from the selector
69 is forwarded to the terminal 76.

[0060] The location code combiner 71 decodes the respective Loc code, and obtains location information (x1,y1) to
(xn,yn) of the plurality of objects. Then, as represented by the following equation, the location code combiner 71 selects
the minimum values of the x- and y-coordinates from the location information, and outputs new location information (x1',y1'):

$(x1', y1') = (\min(x1, x2, \ldots, xn), \min(y1, y2, \ldots, yn))$    (2)

n: the number of combined objects

[0061] The size code combiner 72 decodes the respective Loc and Size code, and obtains location information and
size information (x1,y1), (Sx1,Sy1) to (xn,yn), (Sxn,Syn) of the plurality of objects. Then, the size code combiner 72
calculates new location information (x1',y1') from the equation (2), obtains new size information (Sx1',Sy1') from the
following equation, and outputs the information.

$(Sx1', Sy1') = (\max(x1+Sx1, x2+Sx2, \ldots, xn+Sxn) - x1', \max(y1+Sy1, y2+Sy2, \ldots, yn+Syn) - y1')$    (3)
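
Equations (2) and (3) together amount to taking the bounding rectangle of the combined objects. A minimal sketch follows, in which the function name `combine_loc_and_size` and the numeric values are assumptions chosen for illustration.

```python
def combine_loc_and_size(objects):
    """Apply equations (2) and (3) to a list of ((x, y), (sx, sy)) entries.

    Returns ((x', y'), (sx', sy')) for the combined object.
    """
    x_new = min(x for (x, _), _ in objects)                       # equation (2)
    y_new = min(y for (_, y), _ in objects)
    sx_new = max(x + sx for (x, _), (sx, _) in objects) - x_new   # equation (3)
    sy_new = max(y + sy for (_, y), (_, sy) in objects) - y_new
    return (x_new, y_new), (sx_new, sy_new)


# Hypothetical location and size of two objects to be combined.
objects = [((40, 20), (64, 32)), ((100, 100), (48, 48))]
new_loc, new_size = combine_loc_and_size(objects)
```

For these values the new location is (40, 20) and the new size is (108, 128), since the right and bottom edges of the second object lie at x=148 and y=148.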

[0062] The shape code combiner 73 generates code synthesized from the shapes of the plurality of objects. When
the objects 21 and 22 are synthesized, the shape of a new object is represented by a mask 80 as shown in Fig. 11A.
The original masks 24 and 25 remain the same, and the portion other than the masks is newly added. In Fig. 11A, the
value of the hatched portion is the same as that of the solid black masks 24 and 25. Accordingly, as the zero-run has
increased on the right side of the mask 24, zero run-length is added after the code indicative of the change point nearest to
the frame right end.

[0063] Further, if another object does not exist on the right side of the object 21, the above-described change point
merely indicates the final change point of the line, and the code does not increase. On the other hand, if another object
exists on the right side of the object 21, zero run-length corresponding to the number of pixels between both objects is
added to the code. That is, the code can be replaced with code to which the zero run-length is added. Further, if there is a
third object on the right side of the other object on the right side of the object 21, the code of the object 21 is replaced
with code in which the zero run-length corresponding to the interval between the objects has been added to the code of
the object 21. The replaced code is outputted as new Shape code. Note that with respect to a line including no object,
almost no code is generated.
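
The idea of joining shapes by adding zero run-length can be illustrated on a single scan line. The sketch below is a simplification under stated assumptions: it uses plain one-dimensional (value, length) runs rather than actual MMR code, which is two-dimensional, and the names `append_runs` and `combine_line` as well as the run values are hypothetical.

```python
def append_runs(line, runs):
    """Append (value, length) runs to line, merging runs of equal value."""
    for value, length in runs:
        if length == 0:
            continue
        if line and line[-1][0] == value:
            # Same value as the previous run: extend it instead of adding a run.
            line[-1] = (value, line[-1][1] + length)
        else:
            line.append((value, length))
    return line


def combine_line(left_runs, gap, right_runs):
    """Join one scan line of two masks separated by `gap` zero pixels."""
    line = []
    append_runs(line, left_runs)
    append_runs(line, [(0, gap)])   # zero-run between the two objects
    append_runs(line, right_runs)
    return line


# Hypothetical runs: left mask line, 10 pixels of gap, right mask line.
left = [(0, 2), (1, 5), (0, 1)]
right = [(0, 3), (1, 4)]
combined = combine_line(left, 10, right)
```

In this example the trailing zero-run of the left mask, the gap and the leading zero-run of the right mask merge into a single zero-run of 14 pixels, so only one run in the middle of the line changes.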

[0064] The texture code combiner 74 generates code synthesized from the textures of the plurality of objects. Fig. 11B
shows a status where the texture of the object 21 and that of the object 22 are synthesized. The texture of a new object
is represented as an object 81. The original objects 21 and 22 remain the same, and a hatched portion other than the
objects is newly added. Note that the value of the hatched portion is zero. In the MPEG coding or the like, the DC
component of a pixel of interest is converted into a difference between the DC component and that of a left block. Further,
quantization values of AC components are one-dimensionally arrayed, and zero run-lengths and nonzero values are
encoded. In the hatched portion in Fig. 11B, the difference between the DC component of a pixel of interest and that of
a left block is zero, and the values of all the AC components are zero. In this case, in the MPEG1 coding, in macro-block
units, 1 bit indicative of macro block type, 12 bits for luminance and 4 bits for chromaticity indicative of DC component size,
and 12 bits of EOB (End of Block) indicative of the end of the block, i.e., 29 bits in total, are added. In this manner, the Texture
code of the object 81 where the textures of the plurality of objects are combined is generated, and the Texture code is
outputted.
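
The increase in Texture code length follows directly from the 29 bits per added macro block. The sketch below works this out under the assumptions that the original objects do not overlap and that every object covers a whole number of macro blocks; the function name `added_texture_bits` and the macro block counts are hypothetical.

```python
# 1 (macro block type) + 12 (luminance DC size) + 4 (chromaticity DC size) + 12 (EOB)
BITS_PER_ZERO_MACRO_BLOCK = 1 + 12 + 4 + 12


def added_texture_bits(combined_macro_blocks, original_macro_blocks):
    """Bits added for the zero-valued macro blocks of a combined object.

    combined_macro_blocks: macro block count of the combined object.
    original_macro_blocks: macro block counts of the original objects.
    """
    zero_blocks = combined_macro_blocks - sum(original_macro_blocks)
    if zero_blocks < 0:
        raise ValueError("original objects exceed the combined object")
    return zero_blocks * BITS_PER_ZERO_MACRO_BLOCK


# Hypothetical sizes: a 7x8 combined object built from a 4x2 and a 3x3 object.
bits = added_texture_bits(7 * 8, [4 * 2, 3 * 3])
```

Here 56 - 17 = 39 macro blocks are newly added, giving 39 x 29 = 1131 bits of additional code.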

[0065] The code synthesizer 75 synthesizes outputs from the location code combiner 71 to the texture code combiner
74, to generate code data of the combined object, and outputs the code data to the terminal 76.

[0066] Fig. 9 shows an example of code data synthesized as above. The code data Object 1 of the object 21 and the
code data Object 2 of the object 22 in Fig. 8 are combined into code data Object 1'. Note that the code data Object 3 of
the object 23 remains the same.

[0067] The code data processed as above is inputted into the demultiplexer 9, and the object code data is divided into
Object 0, Object 1' and Object 3. Then, the code data Object 0 is inputted into the object decoder 10; the code data
Object 1' is inputted into the object decoder 11; and the code data Object 3 is inputted into the object decoder 12. The
respective object decoders output location information obtained by decoding the code data and the image data, to the
object compositer 14. The object compositer 14 arranges the image data in accordance with the location information of
the respective objects, to obtain a reproduced image 15.

• Moving Image Coding

[0068] In moving image coding, the coding efficiency is improved by motion compensation. As an example, coding by
using the correlation between frames such as a predicted frame in the MPEG standard will be described.
[0069] Fig. 12 is an example of code data 8 for 1 frame to be motion-compensated. Similar to the code data in Fig. 8,
the code data in Fig. 12 has a header, code data Object 0 indicative of the background object, and code data (Object 1 to
Object 3) of respective objects. Each code data comprises the SC indicative of the header of the object, the Loc code
indicative of location, the Size code indicative of size, the Shape code indicative of shape and the Texture code indicative
of texture. In the MPEG coding, an object is divided into macro blocks and motion compensation is performed in block
units. As a result, the Texture code comprises MV code indicative of a motion vector as a result of motion vector coding,
and DCT-COEF quantized and encoded from the result of block-based division and DCT conversion.
[0070] In moving image coding, the object number regulator 40 performs similar processing to that in frame-based
coding as described above. Although detailed explanation will be omitted, if the shortage of object decoders is "1" and
the distance D12 is the shortest, the objects 21 and 22 are combined, to reduce the number of objects. Further, the
motion vectors included in the respective objects and the result of coding of DCT coefficients are not changed.
[0071] In the MPEG coding or the like, when frame correlation is utilized, the DC component and AC components of
the predicted difference are encoded. Further, a motion vector is converted to the difference between the motion vector and
that of a left macro block. Accordingly, in the hatched portion in Fig. 11B, the difference of the motion vector is zero, and
the values of all the AC components are zero. In the MPEG coding, such a macro block is not encoded and is skipped,
and the corresponding code merely indicates the number of skipped macro blocks. Accordingly, in the object combining,
only the code of the macro block included in the object which appears next is changed, and that change is slight.
In this manner, the Texture code of the combined objects is generated and outputted.

[0072] Fig. 13 is an example of code data synthesized as above. The code data Object 1 of the object 21 and the
code data Object 2 of the object 22 in Fig. 12 are combined into code data Object 1'. Note that the code data Object 3
of the object 23 remains the same.

[0073] As described above, in the present embodiment, in a moving image encoded in object units, if the number of
coded objects is greater than that of decoders, i.e., there is a shortage of decoders, a plurality of objects with a short
distance therebetween are combined to reduce the number of objects in correspondence with the shortage. This enables
efficient and proper reproduction of a moving image including a number of coded objects, with a limited number of decoders.
Further, the objects are synthesized in code data status. That is, as the combining is made by change or addition
of code, the objects can be synthesized at a high speed, and further, the increment in code length is very small.
[0074] Further, in the above description, a coded moving image is decoded; however, a coded still image can be
similarly processed. That is, the above-described frame-based decoding can be applied to still image decoding. For example,
as shown in Fig. 14A, an image 90 includes character areas 91 and 94 and photographic areas 92 and 93, and the
characters are encoded by the MMR (Modified Modified Read) coding and the photographs are encoded by the JPEG
coding. If only one decoder for the MMR coding and only one decoder for the JPEG coding are prepared, the image 90
can be decoded by combining and dividing the respective areas into objects 95 and 96 as shown in Figs. 14B and 14C.

[0075] Note that in the above description, the Shape code is encoded by the MMR coding, and the Texture code is
encoded by the MPEG coding; however, the present invention is not limited to these coding methods. Further, the function
of the demultiplexer 9 may be incorporated into the object number regulator 40. Further, if s-d≥2 holds as the difference
between the number d of object decoders and the number s of objects included in code data, 2·(s-d) objects with a
short distance therebetween may be combined into (s-d) objects, or (s-d+1) objects with the
shortest distance therebetween may be combined into one object.
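
For the case s-d≥2, the first alternative (2·(s-d) objects combined into (s-d) objects) can be sketched as choosing (s-d) disjoint pairs. The description does not specify how the pairs are chosen; the greedy shortest-distance-first order below, the function name `greedy_pairs` and the coordinates are all assumptions of this sketch.

```python
import math
from itertools import combinations


def greedy_pairs(locations, shortage):
    """Pick `shortage` disjoint pairs of objects, shortest distance first.

    locations: {object_id: (x, y)} of the candidate objects; at least
    2 * shortage candidates are assumed to exist.
    """
    ranked = sorted(
        combinations(sorted(locations), 2),
        key=lambda p: math.dist(locations[p[0]], locations[p[1]]),
    )
    used, pairs = set(), []
    for a, b in ranked:
        if len(pairs) == shortage:
            break
        if a not in used and b not in used:
            pairs.append((a, b))   # each pair becomes one combined object
            used.update((a, b))
    return pairs


# Hypothetical locations of five objects with a shortage of two decoders.
locations = {1: (0, 0), 2: (3, 4), 3: (100, 0), 4: (100, 10), 5: (50, 50)}
pairs = greedy_pairs(locations, shortage=2)
```

With these values the pairs (1, 2) and (3, 4) are selected, so four objects are reduced to two and the object 5 is left as it is.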

[0076] In the above description, code data of the regulated number of objects are inputted into the object decoders.
However, if it is arranged such that the code data of the regulated number of objects are temporarily stored in the storage device
50 as shown in Fig. 7, and the code data of the regulated number of objects are read from the storage device 50 and
decoded, decoding processing can be performed at a speed higher than that in decoding with object combining.

[0077] Further, according to the present embodiment, the number of object decoders is not limited. Accordingly, the
processing capability can be easily improved by increasing the object decoders. Further, the location of an object may
be obtained precisely by decoding, e.g., the Shape code, as well as utilizing the location code.

Second Embodiment

[Construction] 

[0078] Fig. 16 is a block diagram showing the construction of the object combiner 43 according to a second embodiment
of the present invention. In Fig. 16, elements corresponding to those in the construction of Fig. 10 have the same
reference numerals, and detailed explanations of the elements will be omitted.

[0079] Numerals 111 and 112 denote code memories having a similar function to that of the code memory 64; 113
and 114, location information memories having a similar function to that of the location information memory 66; and
115, an object motion calculator which detects the movement of objects.

[0080] Numeral 116 denotes a determination unit which determines whether or not object combining is necessary
and determines objects to be combined, based on the results of calculation by the object motion calculator 115 and the
object distance calculator 67, and the number s of objects and the number d of object decoders inputted from the terminals
62 and 63. Numeral 117 denotes a selector which outputs object code data read from the code memory 111, or
from the code memory 112 if necessary, to an output destination designated by the determination unit 116.

[Operation] 

[0081] Next, the operation of the object combiner 43 as shown in Fig. 16 will be described. First, frame-based coding
to independently encode respective frames such as Intraframe coding in the MPEG standard or Motion JPEG coding
will be described.

[0082] In this case, the code memory 111 has the same function as that of the code memory 64 in Fig. 10; the location
information memory 113 has the same function as that of the location information memory 66 in Fig. 10; and the
determination unit 116 has the same function as that of the distance comparator 68 in Fig. 10. Accordingly, the code data for
1 frame inputted from the terminal 61 is stored into the code memory 111. The number s of objects, as the output from
the object counter 41, is inputted into the terminal 62. The number d of object decoders, as the output from the object
decoder counter 42, is inputted into the terminal 63. The location information extractor 65 extracts location information
of respective objects from the code data stored in the code memory 111, and inputs the extracted information into the
location information memory 113. The object distance calculator 67 calculates distances between objects based on the
location information stored in the location information memory 113. The determination unit 116 determines whether or
not object combining is necessary from the number s of objects and the number d of object decoders. If object
combining is necessary, the determination unit 116 determines the number of objects to be combined, then compares
the distances between objects obtained by the object distance calculator 67, and determines the necessary number of
objects to be combined.

[0083] The object code data read from the code memory 111 is inputted into the selector 117. The selector 117
forwards code data of the header, the background object and uncombined objects to the terminal 76. On the other hand, the
selector 117 inputs code data of objects to be combined to the code divider 70. The code data inputted into the code
divider 70 is divided into location, size, shape and texture code data, and inputted into the location code combiner 71,
the size code combiner 72, the shape code combiner 73 and the texture code combiner 74, respectively. Object code
data, combined in a procedure similar to that described in the first embodiment, is outputted from the terminal 76.
[0084] Next, a frame encoded by using the correlation between frames, such as a predicted frame in the MPEG coding,
will be described. In this case, the MPEG-coded frame 20 in Fig. 1 and a frame 100 shown in Fig. 15, following the
frame 20, will be described. Note that in the frame 100, the object 21 (helicopter) has moved rightward, and the object
22 (train) and the object 23 (car) have moved leftward, with respect to the frame 20.

[0085] Prior to processing, the code memories 111 and 112 and the location information memories 113 and 114 are
cleared, and the other elements are initialized. The number s of objects is inputted into the terminal 62, and the number
d of object decoders is inputted into the terminal 63. First, the code data of the frame 20 is inputted into the terminal 61,
and stored into the code memory 111. The location information extractor 65 stores location information of the respective
objects in the frame 20 into the location information memory 113. The object distance calculator 67 obtains distances
between the respective objects in the frame 20.

[0086] Next, the code data in the code memory 111 is moved to the code memory 112, and the location information
in the location information memory 113 is also moved to the location information memory 114; then the code data of
the frame 100 is inputted into the terminal 61 and stored into the code memory 111. The location information extractor
65 stores location information of the respective objects in the frame 100 into the location information memory 113.

[0087] The object motion calculator 115 calculates the motions of the respective objects from the locations of the
respective objects in the location information memories 113 and 114. In the case of the object 21, assuming that its location
in the frame 20 is (x211,y211) and that in the frame 100 is (x212,y212), the motion vector MV21 = (mv21x,mv21y) is
represented as:

$MV21 = (mv21x, mv21y) = ((x212 - x211), (y212 - y211))$    (4)

[0088] Regarding the objects 22 and 23, motion vectors MV22 and MV23 are obtained in a similar manner.
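
Equation (4) can be applied to all objects at once as in the following minimal sketch. The function name `motion_vectors` and the coordinates are assumptions for illustration; the two dictionaries stand in for the contents of the location information memories 114 (previous frame) and 113 (current frame).

```python
def motion_vectors(prev_locations, curr_locations):
    """Return {object_id: (mvx, mvy)} following equation (4)."""
    return {
        obj: (curr_locations[obj][0] - prev_locations[obj][0],
              curr_locations[obj][1] - prev_locations[obj][1])
        for obj in curr_locations
        if obj in prev_locations  # objects absent from the previous frame are skipped
    }


# Hypothetical locations in the frame 20 and in the frame 100.
frame_20 = {21: (40, 20), 22: (100, 100), 23: (300, 120)}
frame_100 = {21: (56, 20), 22: (92, 100), 23: (290, 120)}
mv = motion_vectors(frame_20, frame_100)
```

With these values the object 21 moves rightward, MV21 = (16, 0), while the objects 22 and 23 move leftward, MV22 = (-8, 0) and MV23 = (-10, 0).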

[0089] The distances D12, D13 and D23 as the outputs from the object distance calculator 67, and the motion vectors
MV21, MV22 and MV23 as outputs from the object motion calculator 115 are inputted into the determination unit 116.
The determination unit 116 determines whether or not object combining is necessary, from the number s of objects and
the number d of object decoders, and if the object combining is necessary, determines the number of objects to be
combined and the objects to be combined.
[0090] In this case, objects having motion vectors with directions close to each other are determined as objects to be
combined. To this end, the difference vectors between the motion vectors of the respective objects are obtained, and the
difference vector(s) whose size is less than a threshold value Thdv is selected. That is, the difference vector DV2122
between the motion vector MV21 of the object 21 and the motion vector MV22 of the object 22 is represented by the
following equation:

$DV2122 = (dv2122x, dv2122y) = ((mv21x - mv22x), (mv21y - mv22y))$    (5)

[0091] The size D2122 of the difference vector DV2122 is represented by the following equation:

$D2122 = \sqrt{dv2122x^2 + dv2122y^2}$    (6)

[0092] All the difference vector sizes are obtained. The obtained difference vector sizes D2122, D2223 and D2123
are compared with the threshold value Thdv, and the difference vector size(s) less than the threshold value is selected.
As the objects 22 and 23 have moved in the same direction, the difference vector size D2223 is
less than those with respect to the object 21. If only the difference vector size D2223 is less than the threshold value Thdv,
the objects to be combined are the objects 22 and 23. If all the difference vector sizes are less than the threshold value
Thdv, objects with the shortest distance therebetween are selected as objects to be combined. Further, if there is no
difference vector size less than the threshold value, objects with the shortest difference therebetween are combined.
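
The determination described above can be sketched as follows for a shortage of one decoder. This is a sketch under assumptions: "shortest difference" is read here as the smallest difference vector size, the case where several but not all sizes fall below Thdv is resolved by taking the smallest size, and the name `choose_pair` and all numeric values are hypothetical.

```python
import math
from itertools import combinations


def choose_pair(motion, distances, thdv):
    """Pick the pair of objects to combine from motion vectors and distances.

    motion: {object_id: (mvx, mvy)}; distances: {(id_a, id_b): distance}.
    """
    # Difference vector sizes, following equations (5) and (6).
    sizes = {
        (a, b): math.hypot(motion[a][0] - motion[b][0],
                           motion[a][1] - motion[b][1])
        for a, b in combinations(sorted(motion), 2)
    }
    if all(size < thdv for size in sizes.values()):
        # Every pair moves alike: fall back to the shortest distance.
        return min(sizes, key=lambda p: distances[p])
    # Some or none below the threshold: smallest difference vector size
    # (an assumed reading of the description).
    return min(sizes, key=lambda p: sizes[p])


# Hypothetical motion vectors and distances for the objects 21, 22 and 23.
motion = {21: (16, 0), 22: (-8, 0), 23: (-10, 0)}
distances = {(21, 22): 100.0, (21, 23): 280.0, (22, 23): 200.0}
pair = choose_pair(motion, distances, thdv=5.0)
```

With Thdv = 5 only D2223 = 2 is below the threshold, so the objects 22 and 23 are chosen; with a very large threshold all sizes qualify and the pair with the shortest distance, (21, 22), is chosen instead.
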
[0093] Then, object combining is performed based on the determination. In this case, the object 22 and the object 23
are combined so as to reduce the number of objects. This operation will be described using the code data 8 in Fig. 8 as
an example.

[0094] First, the selector 117 reads the header from the code memory 112 and outputs the header via the terminal
76. Further, the selector 117 reads the code data Object 0 of the background object, and outputs the code data via the
terminal 76. As the object 21 is not combined, the selector 117 similarly reads the code data Object 1 and outputs the
code data via the terminal 76.

[0095] Then, as the next code data Object 2 corresponds to the object 22, the selector 117 reads the code data Object
2 from the code memory 112 and inputs the code data into the code divider 70. The code divider 70 inputs the Loc code
from the object code data into the location code combiner 71, the Loc code and the Size code from the object code data
into the size code combiner 72, the Shape code from the object code data into the shape code combiner 73, and the
Texture code from the object code data into the texture code combiner 74.

[0096] Next, the selector 117 reads the code data Object 3 of the object 23 to be combined with the object 22, from
the code memory 112, and inputs the code data into the code divider 70. As in the case of the code data Object 2,
the divided codes are respectively outputted.

[0097] The location code combiner 71 decodes the respective Loc code, and generates new location information
(x2',y2') from the location information (x2,y2) and (x3,y3) of the two objects, based on the equation (2), then encodes
the new location information (x2',y2'), and outputs the coded location information.

[0098] The size code combiner 72 decodes the Loc code and the Size code, then generates new size information
(Sx2',Sy2') from the location information (x2,y2) and (x3,y3) and size information (Sx2,Sy2) and (Sx3,Sy3) of the two
objects, based on the equation (3), then encodes the new size information (Sx2',Sy2'), and outputs the coded size
information.

[0099] The shape code combiner 73 generates code of a shape synthesized from the shapes of the two objects.
When the objects 22 and 23 are synthesized, the shape of the new object is represented by a mask 150 in Fig. 17A.
That is, in Fig. 17A, a hatched portion is newly added to a mask 25 of the object 22 and a mask 26 of the object 23, as
the mask 150. Note that the value of the hatched portion is the same as that of the mask 80 in Fig. 11A. Then, as in the
case of the first embodiment, addition of zero-run and/or code change is performed, and the obtained code is outputted
as new Shape code.

[0100] The texture code combiner 74 generates code of texture synthesized from the textures of the two objects. Fig.
17B shows a status where the texture of the object 22 and that of the object 23 are synthesized. That is, a texture having
zero value, represented as a hatched portion, is added to the textures of the objects 22 and 23. Then, as in the case
of the first embodiment, code is added in macro block units in the hatched portion or the number of skipped blocks is
changed, and thus Texture code of an object 151 is generated and outputted.

[0101] The code synthesizer 75 synthesizes the outputs from the location code combiner 71 to the texture code combiner
74, to generate code data of the combined object. The code data of the combined object is outputted from the
terminal 76.

[0102] Fig. 18 is an example of code data including the combined object. In Fig. 18, the code data Object 1 of the
object 21 remains the same, while the code data Object 2 of the object 22 and the code data Object 3 of the object 23
are combined as code data Object 2'.

[0103] The generated code data is inputted into the demultiplexer 9, and divided into the code data Object 0, the
Object 1, and the Object 2'. The code data Object 0 is inputted into the object decoder 10; the code data Object 1 is
inputted into the object decoder 11; and the code data Object 2' is inputted into the object decoder 12. The object
decoders 10 to 12 decode the code data, generate location information and image data of the respective objects, and
output them to the object compositer 14. The object compositer 14 arranges the image data in accordance with the
location information of the respective objects, and thus obtains the reproduced image 15.
[0104] In the present embodiment, in a moving image encoded in object units, if the number of coded objects is
greater than that of decoders, objects are combined, starting from objects with motion vectors or moving amounts close to each
other, whereby the original image can be efficiently reproduced even with a limited number of decoders. Further, as
change or addition of code is performed in the form of code data, the processing can be made at a high speed, and the
increment in code length is very small. Further, the decoding load on the respective decoders can be made uniform.
Further, even if the currently-processed frame is not a frame encoded by the Intraframe coding, upon occurrence of a scene
change, objects to be combined in interframe coding can be re-determined.

[0105] In the present embodiment, the difference vector size and the distance between objects are used for determination
of objects to be combined; however, the determination may be made by using only the difference vector size. Further,
in the present embodiment, the Shape code is encoded by the MMR coding, and the Texture code is encoded by
the MPEG coding; however, the present invention is not limited to these coding methods.

[0106] Further, in the present embodiment, the function of the demultiplexer 9 may be incorporated into the object
number regulator 40. Further, the number of object decoders and the number of objects included in code data are not limited
to those in the embodiment. As long as (s-d)≥2 holds, 2·(s-d) objects can be combined into (s-d) objects, starting from objects
with the minimum difference vector size, or (s-d+1) objects can be combined into one object, starting from objects with the
minimum difference vector size, or a combination of the former and latter cases may be employed.

[0107] In the present embodiment, the decoding apparatus having decoders which output decoded results has been
described; however, if it is arranged such that code outputted from the object combiner 43 is temporarily stored into the
storage device 50, and the code read out of the storage device 50 is decoded, object combining is unnecessary at the
time of decoding, and high-speed decoding (image reproduction) is possible.

[0108] Further, in the present embodiment, as the number of object decoders can be freely set, the number of object
decoders can be easily increased so as to improve processing capability. Further, the motion calculation may be made
by referring to the motion vectors of objects as well as referring to the location information of the objects.

Third Embodiment 

[0109] Fig. 19 is a block diagram showing the construction of the object combiner 43 according to a third embodiment
of the present invention. In Fig. 19, elements corresponding to those in Fig. 10 have the same reference numerals, and
detailed explanations of the elements will be omitted.

[0110] A code length extractor 200 extracts code lengths of respective objects of code data stored in the code memory
64, and stores the extracted code lengths into a code length memory 201. A code length comparator 202 compares the
respective code lengths of the objects, stored in the code length memory 201, with each other, then determines whether
or not object combining is necessary, and determines objects to be combined.

[0111] If object combining is performed, objects to be combined are determined sequentially from objects with short
code lengths. For example, if the number s of objects is four (s=4), and the number d of object decoders is three (d=3),
s-d=1 holds; accordingly, two objects having short code lengths are combined into one object. If the code data of the
frame 20 in Fig. 1 is as shown in Fig. 20, the code data Object 2 of the object 22 is the minimum code data, and the
code data Object 1 of the object 21 is the next minimum code data. In this case, the code data Object 1 and Object 2
are combined. The operations of other elements are the same as those in the above embodiments. From the terminal
76, code data as shown in Fig. 21 is outputted.
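
The selection by code length can be sketched as follows. The function name `shortest_code_objects` and the code lengths are hypothetical, and taking the (s-d+1) shortest objects for a larger shortage is an assumption of this sketch; the embodiment itself describes the case s-d=1, where two objects are selected.

```python
def shortest_code_objects(code_lengths, s, d):
    """Return the ids of the objects to combine, shortest code first.

    code_lengths: {object_id: code length} for the candidate objects.
    Returns an empty list when s <= d (no combining is necessary).
    """
    if s <= d:
        return []
    # Combining (s - d + 1) objects into one removes (s - d) objects.
    return sorted(code_lengths, key=code_lengths.get)[: s - d + 1]


# Hypothetical code lengths of the code data Object 1 to Object 3.
code_lengths = {1: 5200, 2: 3100, 3: 9800}
to_combine = shortest_code_objects(code_lengths, s=4, d=3)
```

With these values Object 2 is the shortest and Object 1 the next shortest, so the two are selected for combining, matching the example of Fig. 20.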

[0112] Then, the code data Object 1 and Object 2 as the code data of the object 21 and the object 22 are combined
in all the frames. The details of the combining are as described in the above respective embodiments. Even if a
motion-compensation frame or a frame encoded by the Intraframe-coding of the MPEG coding or the like is included in the
image data, the object combining is made in a similar manner to that in the above respective embodiments.
[0113] According to the present embodiment, similar advantages to those in the above respective embodiments can
be obtained. Further, in the case of a still image as shown in Fig. 14A, a character image is encoded at a high compression
rate by the MMR coding, and the resulting code length is short. Accordingly, if an image where character portions are
combined as shown in Fig. 14B is handled as one object, similar advantages to those as above can be obtained in the
still image in Fig. 14A.

Fourth Embodiment

[0114] In the MPEG coding or the like, a frame within which coding is performed (a frame encoded by the Intraframe-coding)
and a frame encoded by using interframe correlation (a frame encoded by the Interframe-coding) are treated.
The frame encoded by the Intraframe-coding is inserted to ensure synchronism or to prevent accumulation of DCT
differences.

[0115] The present embodiment re-determines objects to be combined when a frame encoded by the Intraframe-coding is processed.
[0116] Fig. 22 is a block diagram showing the construction of the object combiner 43 according to a fourth embodiment
of the present invention. In Fig. 22, elements corresponding to those in Fig. 10 have the same reference numerals,
and detailed explanations of the elements will be omitted.

[0117] Numeral 301 denotes a header analyzer which analyzes the header of each frame. Numeral 302 denotes a
distance comparator having approximately the same operation as that of the distance comparator 68 in Fig. 10. As in
the case of the first embodiment, prior to processing, the number s of objects and the number d of object decoders are
inputted. If s≤d holds, code inputted into the terminal 61 is outputted from the terminal 76 without any processing.

[0118] On the other hand, if s>d holds, the header of code data for 1 frame, inputted from the terminal 61 and stored
into the code memory 64, is inputted into the header analyzer 301. In the header with a description of the frame attribute,
information indicating whether or not the frame is a frame encoded by the Interframe-coding, i.e., a frame encoded
by using interframe correlation, is described. For example, the MPEG coding handles an I frame which is encoded
within the frame without interframe correlation (by Intra coding), and P and B frames encoded by using interframe
correlation with motion compensation.

[0119] When a frame encoded without interframe correlation is detected from the result of header analysis, the
operation of the present embodiment is as follows. Code data is read out of the code memory 64. The location information
extractor 65 extracts the Loc code following the SC of respective objects, and stores the extracted Loc code into the
location information memory 66. The object distance calculator 67 obtains distances between the objects, and the
distance comparator 302 selects objects to be combined, sequentially from objects with the shortest distance therebetween.
Note that the procedure of selection is similar to that of the first embodiment. Information indicative of the selected
objects is held in the distance comparator 302.

[0120] The information indicative of the selected objects held in the distance comparator 302 is updated only if a new
instruction is inputted from the header analyzer 301, i.e., only if a frame encoded without interframe correlation has
been newly detected.

[0121] On the other hand, in a frame encoded by using interframe correlation, object combining is performed in
accordance with the information indicative of selected objects held in the distance comparator 302, and code of a new
object obtained by combining objects is outputted from the terminal 76, as in the case of the first embodiment.

[0122] In this manner, according to the present embodiment, objects to be combined are re-determined upon decoding
of a frame encoded by intraframe coding, whereby change of coding efficiency by object combining can be suppressed.
Even if a frame encoded by intraframe coding is not detected, when a scene change occurs, objects to be
combined are re-determined, even with objects encoded by using interframe correlation. Regarding scene change, it is
determined that a scene change has occurred if, for example, the number of intra-encoded macroblocks in a P frame is
large, or, in a B frame, the motion vectors refer predominantly to either its previous or its subsequent frame.
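
The update rule of the fourth embodiment can be illustrated with a minimal Python sketch. It is not the apparatus
itself: the names Frame, CombineSelector, choose_pairs and the 0.5 threshold are hypothetical, and only the P-frame
scene-change test is modelled, because the text gives no numeric criterion for B frames.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Frame:
    frame_type: str                       # "I", "P" or "B"
    intra_macroblock_ratio: float = 0.0   # share of intra-coded macroblocks (hypothetical field)


@dataclass
class CombineSelector:
    # choose_pairs stands in for the distance-based selection of the first embodiment.
    choose_pairs: Callable[[Frame], List[Tuple[int, int]]]
    scene_change_threshold: float = 0.5   # assumed value, not taken from the text
    held: List[Tuple[int, int]] = field(default_factory=list)

    def needs_update(self, frame: Frame) -> bool:
        # An intra-coded frame always triggers re-determination.
        if frame.frame_type == "I":
            return True
        # A P frame with many intra-coded macroblocks is treated as a scene change.
        if frame.frame_type == "P":
            return frame.intra_macroblock_ratio > self.scene_change_threshold
        return False

    def selection_for(self, frame: Frame) -> List[Tuple[int, int]]:
        # The held selection is replaced only when an update is triggered;
        # otherwise the previously held selection is reused.
        if self.needs_update(frame):
            self.held = self.choose_pairs(frame)
        return self.held
```

Holding the selection between intra-coded frames keeps the set of combined objects stable while interframe prediction
is in use, which is the property the paragraph above relies on.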

[0123] According to the fourth embodiment, as in the case of the first embodiment, by re-determining objects to be
combined in a frame encoded by intraframe coding, change of coding efficiency due to object combining can be
suppressed.

Modifications of First to Fourth Embodiments

[0124] As shown in Fig. 23, the header analyzer as described in the fourth embodiment can be added, as a header
analyzer 401, to the construction in Fig. 16 of the second embodiment. That is, as a result of frame header analysis, if
it is determined that the frame has been encoded without interframe correlation, a determination unit 402 determines
objects to be combined, based on distances between objects outputted from the object distance calculator 67 and
motions of objects outputted from the object motion calculator 115, as in the case of the second embodiment.
Information indicative of objects to be combined is held in the determination unit 402, and only if an instruction is
inputted from the header analyzer 401, the held content is updated.

[0125] As shown in Fig. 24, the header analyzer as described in the fourth embodiment can be added, as a header
analyzer 501, to the construction in Fig. 19 of the third embodiment. That is, as a result of frame header analysis, if it is
determined that the frame has been encoded without interframe correlation, a code length comparator 502 determines
objects to be combined, based on code lengths of respective objects, as in the case of the third embodiment.
Information indicative of objects to be combined is held in the code length comparator 502, and only if an instruction is
inputted from the header analyzer 501, the held content is updated.

[0126] According to the constructions in Figs. 23 and 24, objects to be combined are re-determined upon decoding
of a frame encoded by intraframe coding, whereby change of coding efficiency due to object combining can be
suppressed.

[0127] Further, in the MPEG4 standard, handling of sound data as an object is studied. If a distance between sound
sources of sound objects is regarded as a distance between objects, the first embodiment is applicable, and if the
movement of a sound source is regarded as object motion, the second embodiment is applicable. If code lengths of
respective objects are used, the third embodiment is applicable. Thus, the above-described respective embodiments are
applicable to coding of sound including audio information.

[0128] As described above, the first to fourth embodiments provide decoding apparatus and method which decode
all the objects even if the number of decoders is limited.

[0129] Further, the embodiments provide decoding apparatus and method which decode a coded still image without
degrading the image quality even if the number of decoders is limited.

[0130] Further, the embodiments provide decoding apparatus and method which decode a coded moving image without
degrading the image quality even if the number of decoders is limited.

Fifth Embodiment 

[Construction] 


[0131] Fig. 29 is a block diagram showing the construction of a moving image processing apparatus according to a
fifth embodiment of the present invention. In the present embodiment, the MPEG4 coding is used as a moving image
coding method. Note that the coding method of the present embodiment is not limited to the MPEG4 coding, but any
other coding method can be employed as long as it respectively encodes a plurality of objects within an image.

[0132] In Fig. 29, numeral 1201 denotes an encoder which inputs a moving image and encodes the image by the
MPEG4 coding of Core profile and level 2. Numeral 1202 denotes a storage device used for storing coded moving
image data. The storage device 1202 comprises a magnetic disk, a magneto-optic disk or the like. As the storage
device 1202 is removably attached to the moving image processing apparatus, coded moving image data can be read
in another apparatus. Numeral 1203 denotes a transmitter which transmits encoded moving image data to a LAN or a
communication line, and performs broadcasting or the like; 1204, a receiver which receives code data outputted from
the transmitter 1203; 1205, a profile and level regulator to which the present invention is applied; 1206, a storage device
used for storing output from the profile and level regulator 1205; 1207, a decoder which decodes code data encoded by
the MPEG4 coding of Core profile and level 1; and 1208, a display unit which displays a moving image decoded by the
decoder 1207. Note that as described above, the encoder 1201 performs coding of Core profile and level 2, and in this
example, to simplify the explanation, the encoder 1201 performs coding at a bit rate of 384 Kbps.

[0133] Fig. 43 shows an example of an image to be encoded. In Fig. 43, respective numerals denote objects. An
object 2000 represents background; an object 2001, a balloon moving in the air; an object 2002, a bird; objects 2003
and 2004, a woman and a man.

[0134] Fig. 31A shows a bit stream when the image in Fig. 43 is encoded. The bit stream has arrangement
information α, indicative of location information of the objects 2000 to 2004, at its head. Actually, the arrangement
information α is encoded in the BIFS (Binary Format for Scene description) language to describe scene construction
information, and the arrangement information α is multiplexed. Then, VOSSC, Visual Object data α-1, α-2, α-3 and
VOSEC follow. The code data in Fig. 31A is stored into the storage device 1202 or transmitted via the transmitter 1203.
The code data is inputted via the storage device 1202 or the receiver 1204, into the profile and level regulator 1205 as
a characteristic element of the present invention. The profile and level regulator 1205 also inputs the status of the
decoder 1207.

[0135] Fig. 30 is a block diagram showing the detailed construction of the profile and level regulator 1205. In Fig. 30,
numeral 1101 denotes the code data shown in Fig. 31A; 1102, a separator which separates the code data 1101 into
code data indicative of arrangement information and header information, and code data indicative of respective objects;
1103, a header memory for storing code data indicative of separated arrangement information and header information;
1104 to 1108, code memories for storing code data for respective objects; 1109, a profile and level extractor which
extracts the PLI code from the code data 1101, and extracts information on the profile and level; and 1110, an object
counter which counts the number of objects included in the code data 1101.

[0136] Numeral 1111 denotes a decoder status receiver which obtains coding specifications (profile and level) of the
decoder 1207 and other conditions; 1112, a profile and level input unit through which arbitrary profile and level are
set from a terminal (not shown) or the like; and 1113, a profile and level determination unit which compares outputs from
the profile and level extractor 1109 and the object counter 1110 with profile and level information inputted from the
decoder status receiver 1111 or the profile and level input unit 1112, and determines whether or not the number of
objects must be regulated.

[0137] Numeral 1114 denotes a code length comparator which determines the order of code lengths of objects by
counting the code lengths of objects when the code data 1101 is inputted and comparing the code lengths with each
other; 1115, a header changer which changes the content of header information stored in the header memory 1103,
based on the outputs from the profile and level determination unit 1113 and the code length comparator 1114; 1116, a
multiplexer which multiplexes code data read from the code memories 1104 to 1108 based on the output from the header
changer 1115 and the results of comparison by the code length comparator 1114; and 1117, code data outputted as a
result of profile and level regulation.

[Regulation of Profile and Level] 


[0138] Hereinbelow, the processing in the profile and level regulator 1205 having the above construction will be
described in detail.

[0139] The code data 1101 is inputted into the separator 1102, the profile and level extractor 1109, the object counter
1110 and the code length comparator 1114. The separator 1102 separates the code data 1101 into code data indicative
of arrangement information and header information, and code data indicative of respective objects, and stores the
respective code data into the header memory 1103 and the code memories 1104 to 1108. For example, the object
arrangement information α, VOSSC, Visual Object SC, the respective code immediately prior to the VO data A, and the
header information of VOL and VOP data in Fig. 25, and the like, are stored in the header memory 1103. Further, the
VOL and VOP data for the respective objects, from which the header information is removed, are stored in the code
memories 1104 to 1108. These data are stored independently such that the header-removed part is clearly indicated.
For example, in the image in Fig. 43, as the number of objects is five, the code data of the objects 2000 to 2004 (VO
data A to E in Fig. 31A) are respectively stored into the code memories 1104 to 1108.

[0140] At the same time, the object counter 1110 counts the number of objects included in the code data 1101. Then
the code length comparator 1114 measures code lengths of the respective objects.

[0141] The profile and level extractor 1109 extracts PLI-α from the code data 1101 and decodes it, to extract
information on the profile and level of the code data 1101. At the same time as the extraction, the decoder status
receiver 1111 operates, to obtain information on the profile, level and the like, decodable by the decoder 1207. This
information may instead be set by the user via the profile and level input unit 1112.

[0142] The profile and level determination unit 1113 compares the profile and level information, obtained from the
decoder 1207, or set by the user, with the result of extraction by the profile and level extractor 1109. If the obtained or
set profile and level are higher than or equal to those extracted from the code data 1101, the profile and level
determination unit 1113 does not operate the header changer 1115. Then, the contents of the header memory 1103 and the
code memories 1104 to 1108 are read in the order of input, and multiplexed by the multiplexer 1116. Thus, code data
1117 is generated. That is, the contents of the code data 1117 are the same as those of the code data 1101.

[0143] On the other hand, if the profile and level obtained from the decoder 1207 or set by the user are lower than the
profile and level extracted from the code data 1101, the profile and level determination unit 1113 inputs the number of
objects included in the code data 1101 from the object counter 1110, and compares the number of objects with the
number of decodable objects, determined from the obtained or set profile and level information.

[0144] If the number of objects obtained by the object counter 1110 is less than the number of decodable objects, the
code data 1117 is generated, as in the above-described case where the obtained or set profile and level are
higher than or equal to those extracted from the code data 1101.
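
The decision made by the profile and level determination unit 1113 can be summarized in a short Python sketch. It
assumes, purely for illustration, that a profile and level pair can be reduced to a single comparable integer rank; the
names Action and decide are hypothetical. The text does not state what happens when the number of objects equals the
number of decodable objects, so the sketch treats that case as pass-through, which is an assumption.

```python
from enum import Enum


class Action(Enum):
    PASS_THROUGH = "pass_through"   # code data 1117 is the same as code data 1101
    REGULATE = "regulate"           # the number of objects must be reduced


def decide(stream_rank: int, decoder_rank: int,
           num_objects: int, max_decodable: int) -> Action:
    # Decoder capability at or above the stream: the header changer is not operated.
    if decoder_rank >= stream_rank:
        return Action.PASS_THROUGH
    # Decoder is lower, but the object count still fits (equality assumed to fit).
    if num_objects <= max_decodable:
        return Action.PASS_THROUGH
    # More objects than the decoder can handle.
    return Action.REGULATE


if __name__ == "__main__":
    # Stream at level 2, decoder at level 1 with at most four objects, five objects in the stream.
    print(decide(stream_rank=2, decoder_rank=1, num_objects=5, max_decodable=4))
```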

[0145] On the other hand, if the number of objects obtained by the object counter 1110 is greater than the number of
decodable objects, the number of decodable objects is inputted into the code length comparator 1114, and the code
lengths are compared with each other. The code length comparator 1114 sets objects to be decoded, starting from the
object having the longest code length. That is, the objects are decoded sequentially from the object having the longest
code length. For example, in Fig. 31A, assume that the code lengths of the video objects decrease in the order of the VO
data A, the VO data D, the VO data C, the VO data E, and the VO data B. As the decoder 1207 performs decoding of
Core profile and level 1, it can decode a maximum of four objects. Accordingly, the code length comparator 1114
disables reading of the VO data B from the code memory 1106, and enables reading from the code memories 1104, 1105,
1107 and 1108.
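
As a rough illustration of the selection performed by the code length comparator 1114, the following Python sketch
keeps the objects with the longest code lengths and reports the rest as dropped. The function name and the byte counts
are hypothetical; only the ordering A, D, C, E, B is taken from the example above.

```python
from typing import Dict, List, Tuple


def select_by_code_length(code_lengths: Dict[str, int],
                          max_decodable: int) -> Tuple[List[str], List[str]]:
    # Rank objects from the longest code length to the shortest.
    ranked = sorted(code_lengths, key=code_lengths.get, reverse=True)
    keep = set(ranked[:max_decodable])
    # Report both lists in the original input order, since the memories are read in that order.
    kept = [name for name in code_lengths if name in keep]
    dropped = [name for name in code_lengths if name not in keep]
    return kept, dropped


if __name__ == "__main__":
    # Hypothetical code lengths giving the order A > D > C > E > B.
    lengths = {"A": 5000, "B": 300, "C": 1200, "D": 2400, "E": 800}
    kept, dropped = select_by_code_length(lengths, max_decodable=4)
    print(kept)     # ['A', 'C', 'D', 'E']
    print(dropped)  # ['B']
```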

[0146] Then the profile and level determination unit 1113 operates the header changer 1115 to change the content
of PLI in correspondence with the decoder 1207, and coding is performed. In this manner, header information on the
object undecodable by the decoder 1207 (i.e., the deleted object, VO data B in this case) is deleted, based on the result
of comparison by the code length comparator 1114. That is, the header information of the code data 1101 is rewritten
with contents corresponding to the decoding capability of the decoder 1207 or the set profile and level. Further,
arrangement information on the object 2002 corresponding to the deleted object (VO data B) is deleted from the
arrangement information α, and new arrangement information β is generated.

[0147] Then, the contents of the header changer 1115 and the code memories 1104, 1105, 1107 and 1108 are read
in the order of input, and multiplexed by the multiplexer 1116, thus the code data 1117 is generated. Fig. 31B shows a
bit stream of the code data 1117. In Fig. 31B, the newly generated arrangement information β is provided at the head
of the bit stream, then, VOSSC, Visual Object data β-1, β-2, β-3, and VOSEC follow. The Visual Object data β-1, β-2
and β-3 are obtained by regulating the number of objects with respect to the original Visual Object data α-1, α-2 and
α-3 in Fig. 31A. For example, the Visual Object data β-1 comprises the Visual Object SC positioned at the head, PLI-β
indicative of the profile and level corresponding to the decoder 1207, and code data from which the code data (VO data B)
on the object 2002 is deleted.

[0148] The code data 1117 obtained as above is stored into the storage device 1206, or decoded by the decoder 1207
and displayed on the display unit 1208. Fig. 44 shows a displayed image, represented by the decoded code data 1117.
In Fig. 44, the object 2002, representing the bird in the image as the object of encoding in Fig. 43, is deleted.

[0149] Note that in the above description, the code length comparator 1114 directly counts the code lengths from the
code data 1101; however, the code length comparator 1114 may count the code lengths based on the code data stored
in the code memories 1104 to 1108.

[0150] As described above, according to the present embodiment, even if the coding specifications (profile and/or
level) of a decoder are different from those of an encoder, code data can be decoded. Further, by deleting object data
having the shortest code length, selection of the object to be deleted is facilitated, and the influence on a decoded image
can be suppressed as much as possible.

[0151] Further, even if the number of objects decodable by the decoder 1207 is less than the number defined by the
coding specifications of the code data 1101, as the decoder status receiver 1111 obtains the number of actually
decodable objects, similar advantages can be attained.

[0152] In addition, even when code data having coding specifications higher than or equal to those of the decoder
1207 is inputted, by deleting object(s) to reduce the bit rate, decoding by the decoder 1207 can be performed.

Sixth Embodiment 


[0153] Hereinbelow, a sixth embodiment of the present invention will be described. Note that the general construction
of the moving image processing apparatus according to the sixth embodiment is similar to that in Fig. 29 of the
above-described fifth embodiment; therefore, an explanation of the construction will be omitted.

[0154] Fig. 32 is a block diagram showing the construction of the profile and level regulator 1205 according to the sixth
embodiment of the present invention. In Fig. 32, elements corresponding to those in Fig. 30 have the same reference
numerals and explanations of the elements will be omitted. In the sixth embodiment, the MPEG4 coding is employed
as a moving image coding method; however, any other coding method is applicable as long as it encodes a plurality of
objects within an image.

[0155] In Fig. 32, numeral 1118 denotes a size comparator which extracts sizes of respective objects from the header
memory 1103 and compares the sizes with each other.

[0156] As in the case of the fifth embodiment, the code data 1101 is inputted into the separator 1102, the profile and
level extractor 1109, the object counter 1110 and the code length comparator 1114, and the respective code data are
stored into the header memory 1103 and the code memories 1104 to 1108. At the same time, the object counter 1110
counts the number of objects included in the code data.

[0157] The size comparator 1118 extracts an image size of each object, by extracting the respective VOL_width and
VOL_height code in the bit stream structure in Fig. 25 and decoding the extracted codes.

[0158] Then, as in the case of the fifth embodiment, the profile and level extractor 1109 extracts information on the
profile and level from the code data 1101, and at the same time, information on profile and level and the like of the
decoder 1207 is obtained from the decoder status receiver 1111, or the profile and level are set by the user from the
profile and level input unit 1112.

[0159] The profile and level determination unit 1113 compares the profile and level information obtained from the
decoder 1207 or set by the user, as described above, with the result of extraction by the profile and level extractor 1109.
If the obtained or set profile and level are higher than or equal to the profile and level extracted from the code data 1101,
the profile and level determination unit 1113 does not operate the header changer 1115. Then, the code data 1117 similar
to the code data 1101 is generated.

[0160] On the other hand, if the profile and level obtained from the decoder 1207 or set by the user are lower than the
profile and level extracted from the code data 1101, the profile and level determination unit 1113 inputs the number of
objects included in the code data 1101 from the object counter 1110, and compares the input number with the number
of decodable objects determined from the obtained or set profile and level.

[0161] Then, if the number of objects obtained by the object counter 1110 is less than the number of decodable
objects, the code data 1117 is generated as in the above-described case where the obtained or set profile and level are
higher than or equal to those of the code data 1101.

[0162] On the other hand, if the number of objects obtained by the object counter 1110 is greater than the number of
decodable objects, the number of decodable objects is inputted into the size comparator 1118, and size comparison is
performed. The size comparator 1118 sets a plurality of objects of the code data 1101, sequentially from the largest
image size, as objects to be decoded. That is, the objects are decodable, sequentially from the largest image size. For
example, in Fig. 43, the image sizes of the respective objects decrease in the order of
the objects 2000, 2004, 2001, 2003 and 2002. As the decoder 1207 performs decoding of Core profile and level
1, it can decode a maximum of four objects. Accordingly, in the image in Fig. 43, except the smallest object 2002, the
other four objects can be decoded. The size comparator 1118 disables reading of the code data of the object 2002 from
the code memory 1106, and enables reading from the code memories 1104, 1105, 1107 and 1108.
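
The size-based choice can be sketched in Python as follows. Treating "image size" as width multiplied by height is
an assumption made for this sketch, as are the function name and the pixel dimensions; only the ordering 2000, 2004,
2001, 2003, 2002 comes from the example above.

```python
from typing import Dict, List, Tuple


def objects_to_delete(sizes: Dict[int, Tuple[int, int]],
                      max_decodable: int) -> List[int]:
    # Number of objects that exceed the decoder's capacity.
    excess = len(sizes) - max_decodable
    if excess <= 0:
        return []
    # Sort object ids by area (width * height), smallest first, and drop the excess.
    by_area = sorted(sizes, key=lambda oid: sizes[oid][0] * sizes[oid][1])
    return by_area[:excess]


if __name__ == "__main__":
    # Hypothetical (width, height) pairs as decoded from VOL_width and VOL_height.
    sizes = {2000: (352, 288), 2001: (64, 80), 2002: (24, 16),
             2003: (48, 96), 2004: (56, 112)}
    print(objects_to_delete(sizes, max_decodable=4))  # [2002]
```
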
[0163] Then, as in the case of the fifth embodiment, the profile and level determination unit 1113 operates the header
changer 1115 to change the content of PLI in correspondence with the decoder 1207, and coding is performed.
Further, header information on the object undecodable by the decoder 1207 (i.e., the deleted object, object 2002 in this
case) is deleted, based on the result of comparison by the size comparator 1118. Further, arrangement information on
the deleted object 2002 is deleted from the arrangement information α, and new arrangement information β is generated.
[0164] Then, the contents of the header changer 1115 and the code memories 1104, 1105, 1107 and 1108 are read
in the order of input, and multiplexed by the multiplexer 1116, thus the code data 1117 is generated. Fig. 31B shows a
bit stream of the code data 1117 at this time.

[0165] The code data 1117 obtained as above is stored into the storage device 1206, or decoded by the decoder 1207
and displayed, as an image as shown in Fig. 44, on the display unit 1208.

[0166] Note that in the above description, the size comparator 1118 extracts image sizes of objects based on the
VOL_width and VOL_height code of the code data 1101; however, the size comparator 1118 may extract the image
sizes based on the VOP_width and VOP_height code, or based on shape (mask) information obtained by decoding
code data indicative of shape (mask) information.

[0167] As described above, according to the sixth embodiment, even if the coding specifications of a decoder are
different from those of an encoder, code data can be decoded. Further, by deleting object data having the minimum image
size, selection of the object to be deleted is facilitated, and the influence on a decoded image can be suppressed as much
as possible.

[0168] Note that in the fifth and sixth embodiments, only one object is deleted; however, two or more objects can be
deleted. Further, it may be arranged such that the user directly designates object(s) to be deleted.

[0169] Further, it may be arranged such that the order of deletion is set for the respective objects of the image in
advance by the profile and level input unit 1112.

Seventh Embodiment 

[0170] Hereinbelow, a seventh embodiment of the present invention will be described. Note that the general
construction of the moving image processing apparatus according to the seventh embodiment is similar to that in Fig. 29 of
the fifth embodiment; therefore, an explanation of the construction will be omitted.

[0171] Fig. 33 is a block diagram showing the detailed construction of the profile and level regulator 1205 according
to the seventh embodiment of the present invention. In Fig. 33, elements corresponding to those in Fig. 30 have the
same reference numerals and explanations of the elements will be omitted. In the seventh embodiment, the MPEG4
coding is employed as a moving image coding method; however, any other coding method is applicable as long as it
encodes a plurality of objects within an image.

[0172] In Fig. 33, numeral 1120 denotes an object selection designator which displays a plurality of objects, and into
which the user's designation of arbitrarily selected objects is inputted; 1121, an object selector which selects code data
of objects to be processed, based on designation from the object selection designator 1120, and the result of
determination by the profile and level determination unit 1113; 1122 and 1124, selectors, controlled by the object selector
1121, which switch their input and output; and 1123, an object integrator which integrates a plurality of objects.

[0173] As in the case of the above-described fifth embodiment, the code data 1101 is inputted into the separator 1102,
the profile and level extractor 1109 and the object counter 1110. The separator 1102 separates the code data 1101 into
code data indicative of arrangement information and header information and code data indicative of respective objects.
The respective code data are stored into the header memory 1103 and the code memories 1104 to 1108. At the same
time, the object counter 1110 counts the number of objects included in the code data 1101.

[0174] Then, as in the case of the fifth embodiment, the profile and level extractor 1109 extracts information on the
profile and level from the code data 1101. The decoder status receiver 1111 obtains information on profile and level and
the like of the decoder 1207. Alternatively, the profile and level may be set by the user via the profile and level input
unit 1112.

[0175] The profile and level determination unit 1113 compares the profile and level information obtained from the
decoder 1207 or set by the user, as described above, with the result of extraction by the profile and level extractor 1109.
If the obtained or set profile and level are higher than or equal to the profile and level extracted from the code data 1101,
the profile and level determination unit 1113 controls the object selector 1121 to select a path directly connecting the
selector 1122 to the selector 1124 such that the code data does not pass through the object integrator 1123. The header
changer 1115 is not operated. The code data stored in the header memory 1103 and the code memories 1104 to 1108
are read out in the order of input, and multiplexed by the multiplexer 1116. Thus, the code data 1117 similar to the code
data 1101 is generated.

[0176] On the other hand, if the profile and level obtained from the decoder 1207 or set by the user are lower than the
profile and level extracted from the code data 1101, the profile and level determination unit 1113 inputs the number of
objects included in the code data 1101 from the object counter 1110, and compares the number of objects with the
number of decodable objects, determined from the obtained or set profile and level information.

[0177] If the number of objects obtained by the object counter 1110 is less than the number of decodable objects, the
code data 1117 is generated, as in the above-described case where the obtained or set profile and level are
higher than or equal to those extracted from the code data 1101.

[0178] On the other hand, if the number of objects obtained by the object counter 1110 is greater than the number of
decodable objects, the number of decodable objects is inputted into the object selector 1121. The object selector 1121
displays statuses of the respective objects (e.g., the image in Fig. 43), information on the respective objects, information
on the number of integrated objects and the like, on the object selection designator 1120. The user selects objects to
be integrated in accordance with this information, and inputs an instruction on the selection into the object selection
designator 1120.

[0179] In the seventh embodiment, as the decoder 1207 performs decoding of Core profile and level 1, it can decode
a maximum of four objects. For example, as the image in Fig. 43 has five objects, two of them are integrated into one
object, whereby code data decodable by the decoder 1207 can be obtained. Hereinbelow, a case where the user has
designated integration of the object 2003 and the object 2004 in the image in Fig. 43 will be described.

[0180] When the user designates the objects to be integrated via the object selection designator 1120, the profile and
level determination unit 1113 operates the header changer 1115 to change the content of PLI in correspondence with
the decoder 1207, generate header information on the new object obtained by integration and delete header information
on the objects deleted by the integration, based on the result of selection by the object selector 1121. More
specifically, arrangement information of the new object obtained as a result of integration is generated and arrangement
information of the original objects 2003 and 2004 is deleted, based on the arrangement information of the objects
2003 and 2004. Then, the size of the object obtained by the integration and other information are generated as header
information, and header information of the original objects 2003 and 2004 is deleted, based on the header information
of the objects 2003 and 2004.

[0181] The object selector 1121 controls the input/output of the selectors 1122 and 1124 so as to perform integration
processing by the object integrator 1123 with respect to code data of the objects 2003 and 2004, and to avoid processing
by the object integrator 1123 with respect to other code data.

[0182] Then, contents of the header changer 1115 and the code memories 1104 to 1106 holding the code data of the
objects 2000 to 2002 are read out in the order of input, and multiplexed by the multiplexer 1116 via the selectors 1122
and 1124. On the other hand, the contents of the code memories 1107 and 1108 holding the code data of the objects
2003 and 2004 to be integrated are inputted via the selector 1122 to the object integrator 1123.

[Object Integrator] 

[0183] Fig. 34 is a block diagram showing the detailed construction of the object integrator 1123. In Fig. 34, numerals
1050 and 1051 denote code memories respectively for storing code data of objects to be integrated; 1052 and 1054,
selectors which switch input/output for respective objects; 1053, an object decoder which decodes code data and
reproduces an image of an object; 1055 and 1056, frame memories for storing reproduced images for respective objects;
1057, a synthesizer which synthesizes objects in accordance with arrangement information of objects to be integrated
stored in the header memory 1103; and 1058, an object encoder which encodes image data obtained by synthesizing
and outputs the encoded image data.

[0184] Hereinbelow, the operation of the object integrator 1123 will be described in detail. The code data of the objects
2003 and 2004 to be integrated are stored into the code memories 1050 and 1051. First, the selector 1052 selects an
input on the code memory 1050 side, and the selector 1054, an output on the frame memory 1055 side. Thereafter, the
code data is read out from the code memory 1050, and decoded by the object decoder 1053. Then image information
of the object 2003 is written via the selector 1054 into the frame memory 1055. The image information of the object
comprises image data indicative of a color image and mask information indicative of a shape. Then, the input and output
of the selectors 1052 and 1054 are switched to the opposite sides, and similar processing is performed, whereby the
image information of the object 2004 is stored into the frame memory 1056.

[0185] The synthesizer 1057 obtains location information and size information of the objects 2003 and 2004 from the
header memory 1103, and obtains the image size of the new object obtained by object synthesizing and relative locations
of the original objects 2003 and 2004 in the new object. Then, the image information in the frame memories 1055
and 1056 is read out, and the color image information and the mask information are respectively synthesized. Fig. 37
shows the result of synthesizing of color image information. Fig. 38 shows the result of synthesizing of mask information.
The object encoder 1058 encodes the color image information and the mask information in accordance with the
MPEG4 object coding. Then, the object integrator 1123 outputs the encoded information.
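
The synthesizing step of paragraphs [0184] and [0185] can be pictured with the following minimal sketch. It is an illustration only, not the circuit of the synthesizer 1057: the class `DecodedObject`, the function `synthesize`, the `fill` value and the list-of-rows image layout are all assumptions introduced here. The sketch derives the size of the new object from the location and size of the two originals and copies each original's pixels and mask to its relative location.

```python
from dataclasses import dataclass


@dataclass
class DecodedObject:
    """Hypothetical container for one decoded object (not named in the source)."""
    x: int        # left edge within the frame (location information)
    y: int        # top edge within the frame
    color: list   # rows of pixel values (color image information)
    mask: list    # rows of 0/1 flags (mask information indicative of shape)


def synthesize(a, b, fill=0):
    """Merge two decoded objects into one object covering both."""
    left, top = min(a.x, b.x), min(a.y, b.y)
    right = max(a.x + len(a.color[0]), b.x + len(b.color[0]))
    bottom = max(a.y + len(a.color), b.y + len(b.color))
    width, height = right - left, bottom - top
    color = [[fill] * width for _ in range(height)]
    mask = [[0] * width for _ in range(height)]
    for obj in (a, b):
        dx, dy = obj.x - left, obj.y - top  # relative location in the new object
        for r, (color_row, mask_row) in enumerate(zip(obj.color, obj.mask)):
            for c, (pixel, inside) in enumerate(zip(color_row, mask_row)):
                if inside:  # copy only pixels that belong to the shape
                    color[dy + r][dx + c] = pixel
                    mask[dy + r][dx + c] = 1
    return DecodedObject(left, top, color, mask)
```

In this sketch the area between the two originals keeps the `fill` value and a mask of 0, so it would not be displayed after decoding.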

[0186] The code data outputted from the object integrator 1123 is multiplexed with other code data by the multiplexer
1116 via the selector 1124, thus the code data 1117 is obtained. Fig. 35 shows a bit stream of the code data 1117. Fig.
35 shows the result of integration processing according to the seventh embodiment with respect to the code data 1101
in Fig. 31A. In Fig. 35, the bit stream has arrangement information γ including arrangement information of the newly
obtained object as the result of synthesizing, VOSSC, Visual Object data γ-1, γ-2 and γ-3, and VOSEC. The Visual
Object data γ-1, γ-2 and γ-3 are obtained by object integration regulation with respect to the original Visual Object data
α-1, α-2 and α-3 shown in Fig. 31A. For example, the Visual Object data γ-1, following Visual Object SC, comprises PLI-γ
indicative of profile and level appropriate to the decoder 1207, VO data A, VO data B and VO data C as respective
code data of the objects 2000 to 2002, and code data VO data Q obtained by integrating the objects 2003 and 2004.
[0187] The code data 1117 obtained as above is stored into the storage device 1206, or decoded by the decoder 1207
and reproduced as an image as shown in Fig. 43 and displayed on the display unit 1208.

[0188] Note that in the seventh embodiment, the user selects and designates objects to be integrated within an image
by the object selection designator 1120; however, the present invention is not limited to this example. For example, it
may be arranged such that the integration order is set for objects of the image in advance by the object selection designator
1120, then if the number of objects decodable by the decoder 1207 is less than the number of objects of the
image and object integration is required, object integration is automatically performed in accordance with the set order.
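
The automatic variant described in paragraph [0188] could be sketched as follows. The function name `plan_integration` and the reading that the first objects in the preset order are collapsed into a single object are assumptions made for illustration; the source only states that integration follows the set order.

```python
def plan_integration(ordered_ids, decodable):
    """Split object ids into (ids to integrate into one object, ids left as-is).

    `ordered_ids` lists the objects in the integration order set in advance;
    `decodable` is the number of objects the decoder can handle.
    """
    if decodable < 1:
        raise ValueError("the decoder must handle at least one object")
    excess = len(ordered_ids) - decodable
    if excess <= 0:
        return [], list(ordered_ids)           # no integration required
    merged = list(ordered_ids[:excess + 1])    # excess + 1 objects become one
    return merged, list(ordered_ids[excess + 1:])
```

With five objects and a decoder limited to four, the first two objects in the preset order are integrated and the remaining three pass through unchanged.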
[0189] As described above, according to the seventh embodiment, even if profile and/or level of a decoder are different
from those of an encoder, code data can be decoded. Further, by integrating objects and decoding the integrated
object, loss of decoded object can be prevented.

[0190] Further, the object integration processing can be performed in incremental order of code length or image size
by providing the code length comparator 1114 and the size comparator 1118 shown in the fifth and sixth embodiments
in place of the object selection designator 1120 and the object selector 1121 for controlling the object integrator 1123.
[0191] Fig. 36 is a block diagram showing the construction of the object integrator 1123 according to a modification
of the seventh embodiment. In Fig. 36, elements corresponding to those in Fig. 34 have the same reference numerals
and explanations of the elements will be omitted. The construction of Fig. 36 is characterized by further comprising a
code length counter 1059. The code length counter 1059 counts code lengths of code data of respective objects prior
to integration, and parameters (e.g., quantization parameters or the like) of the object encoder 1058 are controlled such
that the code length of output from the object encoder 1058 is the same as the counted result. Thus the objects can be
synthesized without increasing the total code length.
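
One possible way to picture the control described in paragraph [0191] is a search over the quantization parameter until the re-encoded object fits the counted code length. The sketch below is an assumption-laden illustration: `encode_within_budget`, the caller-supplied `encode(image, qp)` stand-in for the object encoder 1058, the parameter range 1 to 31, and the "does not exceed" criterion (the source says "the same as") are all introduced here.

```python
def encode_within_budget(encode, image, budget_bytes, qp_values=range(1, 32)):
    """Return (qp, code) for the first quantization parameter whose output fits.

    `encode(image, qp)` is a hypothetical stand-in for the object encoder;
    a larger qp is assumed to produce shorter code. If no value fits, the
    result for the last (coarsest) value tried is returned.
    """
    result = None
    for qp in qp_values:
        code = encode(image, qp)
        result = (qp, code)
        if len(code) <= budget_bytes:
            break
    if result is None:
        raise ValueError("qp_values must not be empty")
    return result
```

Here `budget_bytes` plays the role of the counted result, i.e. the sum of the code lengths of the objects prior to integration.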

Eighth Embodiment

[0192] Hereinbelow, an eighth embodiment of the present invention will be described. As in the case of the above-described
seventh embodiment, object integration processing is performed in the eighth embodiment. Note that the
general construction of the moving image processing apparatus of the eighth embodiment, and the detailed construction
of the profile and level regulator 1205 are the same as those in Fig. 33, therefore explanations of the apparatus and
the construction of the profile and level regulator will be omitted.

[0193] Fig. 39 is a block diagram showing the detailed construction of the object integrator 1123 according to the
eighth embodiment of the present invention. In Fig. 39, elements corresponding to those in Fig. 34 have the same reference
numerals and explanations of the elements will be omitted.

[0194] In Fig. 39, numerals 1060 and 1061 denote separators which separate input code data into code data on mask
information indicative of shape and code data indicative of color image information and output the separated data.
Numerals 1062 to 1065 denote code memories. The code data indicative of color image information is stored into the
code memories 1062 and 1064, and the code data on mask information is stored into the code memories 1063 and
1065, for respective objects. Numeral 1066 denotes a color image information code synthesizer which synthesizes the
code data indicative of color image information in the form of code data; 1067, a mask information code synthesizer
which synthesizes the code data indicative of mask information in the form of code data; and 1068, a multiplexer which
multiplexes code outputted from the color image information code synthesizer 1066 and the mask information code
synthesizer 1067.

[0195] Hereinbelow, the operation of the object integrator 1123 according to the eighth embodiment will be described in
detail. As in the case of the seventh embodiment, the code data of the objects 2003 and 2004 are stored respectively
into the code memories 1050 and 1051. The code data of the object 2003 stored in the code memory 1050 is read out
in frame units (VOP units), separated by the separator 1060 into code data of color image information and code data of
mask information, and the respective code data are stored into the code memories 1062 and 1063. Similarly, code data
of color image information and code data of mask information of the object 2004 are stored into the code memories
1064 and 1065.

[0196] Thereafter, the color image information code synthesizer 1066 reads the color image information code data
from the code memories 1062 and 1064. Further, as in the case of the seventh embodiment, the color image information
code synthesizer 1066 obtains location information and size information of the objects 2003 and 2004 from the
header memory 1103, and obtains the image size of the synthesized new object and respective relative locations of the
original objects 2003 and 2004 in the new object. That is, the color image information code synthesizer 1066 performs
synthesizing on the assumption that if these color image information code data are synthesized and decoded, an image
as shown in Fig. 37 can be obtained as one object.

[0197] Note that the MPEG4 coding method has a slice data structure to define a plurality of macro blocks as a cluster
of blocks in a main scanning direction. Fig. 40 shows an example of the slice structure applied to the objects in Fig. 37.
In Fig. 40, an area in a bold frame is defined as one slice. In each slice, the head macro block is hatched.
[0198] The color image information code synthesizer 1066 performs reading in a rightward direction (main scanning
direction) as shown in Fig. 40, sequentially from the upper left macro block data of the image to be obtained as a result
of synthesizing. That is, among the code data of the object 2003, code data corresponding to the head macro block of
the head slice is read from the code memory 1062 first. The header information of the slice is added to the read code
data, and the code data of the head macro block is outputted. Then, the code data corresponding to the macro block
on the right of the head macro block is read and outputted. In this manner, the read and output operations are sequentially
repeated to the end of the slice.

[0199] Note that a portion where data has been newly generated between the objects 2003 and 2004 is considered
as a new slice. As this portion is not displayed even if decoded with mask information, appropriate pixels are provided
to cover the portion. That is, such portion comprises only DC component of the last macro block including an object. As
the DC difference is "0", and all the AC coefficients are "0", no code is generated.

[0200] Then, as it is considered that a new slice has started on the edge of the object 2004, a hatched macro block
in Fig. 40 is regarded as the head of a new slice, and the header information of the slice is added to the block. In this case,
as the address of the head macro block is an absolute address, the address is converted to a relative address from the
macro block including the previous object. Note that in the macro block, if DC component or the like is predicted by referring
to another macro block, that portion is re-encoded, then code data of the macro block is sequentially outputted in
the rightward direction. That is, the slice header is added on the edge of the object, and the prediction of the slice head
macro block is replaced with initialized code. The obtained code is outputted to the multiplexer 1068.

[0201] In parallel to the operation of the color image information code synthesizer 1066, the mask information code
synthesizer 1067 reads the code data of the mask information from the code memories 1063 and 1065. Then, the mask
information code synthesizer 1067 obtains location information and size information of the objects 2003 and 2004 from
the header memory 1103, and obtains the image size of a synthesized new object and relative locations of the original
objects 2003 and 2004 in the new object. Then, by decoding and synthesizing the input code data of the mask information,
the mask information code synthesizer 1067 obtains mask information as shown in Fig. 38. The mask information
code synthesizer 1067 encodes the mask image by arithmetic encoding as the MPEG4 shape information coding
method. The obtained code is outputted to the multiplexer 1068.

[0202] Note that the mask information coding is not limited to the MPEG4 arithmetic coding method. For example, in
the result of synthesizing of mask information code data, as the zero-run between object edges is merely lengthened,
the synthesizing can be made only by replacing code representing the zero-run length without decoding by the mask
information code synthesizer 1067, by employing zero-run coding or the like, used in a facsimile apparatus. Generally,
even when mask information is encoded by the arithmetic or another coding, the code length is merely slightly changed.
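
The zero-run observation in paragraph [0202] can be illustrated on one mask row. The sketch below assumes a simple run-length representation that is not specified in the source: each row is a list of run lengths alternating zero-run, one-run, zero-run and so on, always starting with a (possibly empty) zero-run. The names `merge_mask_rows` and `expand` are hypothetical. Joining two rows only lengthens the zero-run between the object edges; no one-run is touched.

```python
def merge_mask_rows(runs_a, gap, runs_b):
    """Join two run-length coded mask rows separated by `gap` zero pixels."""
    merged = list(runs_a)
    if len(merged) % 2 == 0:
        # An even number of runs means row A ends on a one-run (or is empty):
        # open a new zero-run to hold the gap.
        merged.append(0)
    # The trailing zero-run of A, the gap and the leading zero-run of B
    # collapse into one lengthened zero-run.
    merged[-1] += gap + (runs_b[0] if runs_b else 0)
    merged.extend(runs_b[1:])
    return merged


def expand(runs):
    """Decode a run list back to 0/1 pixels (used here only for checking)."""
    bits = []
    for index, length in enumerate(runs):
        bits.extend([index % 2] * length)
    return bits
```

For example, joining `[1, 2]` and `[0, 3, 1]` with a gap of 2 yields `[1, 2, 2, 3, 1]`: only the zero-run between the two shapes has changed.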

[0203] The multiplexer 1068 multiplexes the code data on the synthesized color image information and the code data
of the mask information, as code data of one object. The subsequent processing is similar to that in the above-described
seventh embodiment. The multiplexer 1116 multiplexes the code data with other code data and outputs the
data.

[0204] As described above, according to the eighth embodiment, even in a case where the profile and/or level of an
encoder are different from those of a decoder, code data can be decoded. Further, as objects are integrated in the form
of code data, loss of object in decoded image data can be prevented only by adding header information.

[0205] Further, in the object integration processing according to the eighth embodiment, a newly added header can
be obtained by a slight amount of calculation, and further, code change is limited to the head block of a slice. Accordingly,
the object integration processing can be performed at a speed higher than that in the object integration processing
by decoding and re-encoding according to the seventh embodiment.


Ninth Embodiment 

[0206] Hereinbelow, a ninth embodiment of the present invention will be described. In the ninth embodiment, object
integration processing is performed, as in the case of the above-described seventh embodiment. Note that the general
construction of the moving image processing apparatus according to the ninth embodiment is similar to that in Fig. 29
of the fifth embodiment, therefore, an explanation of the construction will be omitted.

[0207] Fig. 41 is a block diagram showing the detailed construction of the profile and level regulator 1205 according
to the ninth embodiment of the present invention. In Fig. 41, elements corresponding to those of the seventh embodiment
in Fig. 33 have the same reference numerals, and explanations of the elements will be omitted. In the ninth
embodiment, the MPEG4 coding is employed as a moving image coding method, however, any other coding method is
applicable as long as it encodes a plurality of objects within an image.

[0208] In Fig. 41, numeral 1170 denotes an object arrangement information determination unit which determines
objects to be integrated.

[0209] As in the case of the seventh embodiment, the profile and level determination unit 1113 compares the profile
and level information of the decoder 1207 with those of the code data 1101. Whether the profile and level of the decoder
1207 are higher than or equal to, or lower than, those of the code data 1101, the code data 1117 is generated in a similar
manner to that of the seventh embodiment as long as the number of objects obtained by the object counter 1110 is within
the number decodable by the decoder 1207.

[0210] On the other hand, if the number of objects obtained by the object counter 1110 is greater than the number
decodable by the decoder 1207, the decodable number of objects is inputted into the object arrangement information
determination unit 1170. As in the case of the seventh embodiment, the maximum number of objects decodable by the
decoder 1207 is four. Accordingly, in an image having five objects as shown in Fig. 43, decodable code data can be
obtained by integrating two objects.

[0211] The object arrangement information determination unit 1170 extracts location information and size information
of the respective objects from the header memory 1103, and determines two objects to be integrated based on the following
conditions. Note that condition (1) is given higher priority than condition (2).

(1) One object is included in the other object

(2) The distance between both objects is the shortest

[0212] In the image shown in Fig. 43, the objects 2001 to 2004 are included in the object 2000. Accordingly, the object
arrangement information determination unit 1170 determines the object 2000 and the object 2001 as objects to be integrated.
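
The two conditions of paragraph [0211] can be sketched as below. The source does not state how the distance is measured or how several containing pairs are ranked, so the center-to-center Euclidean distance and the "first pair in input order" tie-break are assumptions of this sketch, as are the names `ObjectBox`, `contains`, `center_distance` and `choose_pair`.

```python
from dataclasses import dataclass
from itertools import combinations
from math import hypot


@dataclass(frozen=True)
class ObjectBox:
    """Hypothetical record of one object's location and size information."""
    obj_id: int
    x: int
    y: int
    width: int
    height: int


def contains(outer, inner):
    """True if the rectangle of `inner` lies entirely inside `outer`."""
    return (outer.x <= inner.x and outer.y <= inner.y
            and inner.x + inner.width <= outer.x + outer.width
            and inner.y + inner.height <= outer.y + outer.height)


def center_distance(a, b):
    """Assumed distance measure: Euclidean distance between rectangle centers."""
    return hypot((a.x + a.width / 2) - (b.x + b.width / 2),
                 (a.y + a.height / 2) - (b.y + b.height / 2))


def choose_pair(boxes):
    """Pick two object ids: containment first, otherwise the closest pair."""
    pairs = list(combinations(boxes, 2))
    if not pairs:
        raise ValueError("at least two objects are required")
    for a, b in pairs:                      # condition (1), input order
        if contains(a, b) or contains(b, a):
            return a.obj_id, b.obj_id
    a, b = min(pairs, key=lambda pair: center_distance(*pair))  # condition (2)
    return a.obj_id, b.obj_id
```

When a background object contains every other object and is listed first, the sketch selects the background and the next object in input order, which matches the choice of the objects 2000 and 2001 described above.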

[0213] When the objects to be integrated have been determined, the profile and level determination unit 1113 operates
the header changer 1115 to change and encode the content of the PLI in accordance with the decoder 1207, and
generate header information on a new object obtained by object integration and delete the header information on the
integrated objects, as in the case of the seventh embodiment, based on the result of determination by the object
arrangement information determination unit 1170. More specifically, arrangement information on the new object
obtained by object integration is generated, based on the arrangement information of the objects 2000 and 2001, and
arrangement information of the original objects 2000 and 2001 is deleted. Then, the image size information or other
information of the object obtained by the integration is generated as header information, based on the header information
of the objects 2000 and 2001, and the header information of the original objects 2000 and 2001 is deleted.

[0214] The object arrangement information determination unit 1170 controls input/output of the selectors 1122 and
1124 so as to perform integration processing on the code data of the objects 2000 and 2001 by the object integrator
1123, and so as not to pass the other code data through the object integrator 1123.
[0215] Then, the contents of the header changer 1115 and the code memories 1106 to 1108 holding the code data
of the objects 2002 to 2004 are read out sequentially in the order of input, and inputted via the selectors 1122 and 1124
into the multiplexer 1116. On the other hand, the contents of the code memories 1104 and 1105 holding the code data
of the objects 2000 and 2001 to be integrated are integrated by the object integrator 1123, and inputted into the multiplexer
1116. The multiplexer 1116 multiplexes these code data, thus generates the code data 1117. Note that the integration
processing by the object integrator 1123 is realized in a similar manner to that in the above-described seventh
embodiment or eighth embodiment.

[0216] Fig. 42 shows a bit stream of the code data 1117 according to the ninth embodiment. Fig. 42 shows the result
of integration processing of the ninth embodiment performed on the code data 1101 as shown in Fig. 31A. In Fig. 42,
arrangement information δ including arrangement information of the newly obtained object is provided at the head.
Then VOSSC, Visual Object data δ-1, δ-2, δ-3, and VOSEC follow. The Visual Object data δ-1, δ-2, δ-3 are obtained by
performing object integration regulation on the original Visual Object data α-1, α-2, and α-3 in Fig. 31A. For example,
the Visual Object data δ-1 comprises Visual Object SC, then PLI-δ indicative of profile and level appropriate to the
decoder 1207, VO data H as code data obtained by integrating the objects 2000 and 2001, and VO data C, VO data D
and VO data E as code data of the objects 2002 to 2004.

[0217] The code data 1117 obtained as above is stored into the storage device 1206, or decoded by the decoder 1207
and reproduced as an image as shown in Fig. 43, and displayed on the display unit 1208.

[0218] Note that in the ninth embodiment, as in the cases of the fifth and sixth embodiments, code lengths of respective
objects, object sizes and the like may be added to the conditions for determining objects to be integrated.
[0219] As described above, according to the ninth embodiment, even if the profile and/or level of an encoder are different
from those of a decoder, code data can be decoded. Further, loss of decoded object can be prevented while suppressing
the amount of code changed by integration, by integrating the objects based on the location relation among
the objects.

[0220] In the ninth embodiment, objects to be integrated are determined based on the location relation among the
objects. The determination according to the ninth embodiment may be employed in the above-described fifth and sixth
embodiments. That is, objects to be deleted can be selected based on location information of objects.

[0221] Note that in the seventh to ninth embodiments, two objects are integrated and one object is generated. How- 
ever, three or more objects, or two or more sets of objects may be integrated. 

[0222] Note that the arrangement of the code memories 1104 to 1108 and the header memory 1103 is not limited to
that shown in Fig. 41. More code memories can be provided, or one memory may be divided into a plurality of areas.
Further, a storage medium such as a magnetic disk may be employed.

[0223] Further, the selection of objects to be deleted or integrated may be determined based on the combination of a 
plurality of conditions such as sizes and code lengths of objects, location relation among the objects and user's instruc- 
tion. 

[0224] Further, in a case where the fifth to ninth embodiments are applied to an image editing apparatus, even if the
number of objects changes due to editing processing, the output from the apparatus can be adjusted to an arbitrary profile
and/or level.

[0225] As described above, according to the fifth to ninth embodiments, code data encoded for a plurality of image 
information (objects) can be decoded by decoders of arbitrary specifications. Further, the number of objects included in 
the code data can be regulated. 


Tenth Embodiment 
[Construction] 

[0226] Fig. 45 is a block diagram showing the construction of the moving image processing apparatus according to a
tenth embodiment of the present invention. In the tenth embodiment, the MPEG4 coding is employed as a moving
image coding method. Note that the coding method is not limited to the MPEG4 coding, but any other coding method
is applicable as long as it encodes a plurality of objects within an image.

[0227] In Fig. 45, numerals 2201 and 2202 denote storage devices holding moving image code data. The storage
devices 2201 and 2202 respectively comprise a magnetic disk, a magneto-optical disk, a magnetic tape, a semiconductor
memory or the like. Numeral 2203 denotes a TV camera which obtains a moving image and outputs a digital
image signal; 2204, an encoder which performs coding by the MPEG4 coding method; 2205, a communication line of
a local area network (LAN), a public line, a broadcasting line or the like; 2206, a communication interface which receives
coded data from the communication line 2205; and 2207, an editing operation unit which displays image editing condition.
The user inputs editing instruction from the editing operation unit 2207. Further, numeral 2208 denotes an image
editing unit characteristic of the present embodiment; 2209, a storage device for storing output from the image editing
unit 2208; 2210, a decoder which decodes code data of a moving image encoded by the MPEG4 coding; and 2211, a display
unit which displays a moving image decoded by the decoder 2210.

[Image Editing]

[0228] Hereinbelow, image editing processing of the present embodiment will be described using a specific image as
an example.

[0229] Image data, encoded by the MPEG4 coding of Core profile and level 2 at a bit rate of 384 kbps, is stored into
the storage device 2201. Fig. 46A shows an example of the image stored in the storage device 2201. Fig. 50A shows
the code data of the image. In the image of Fig. 46A, a background object 2300 includes objects 2304 and 2305 representing
men. In Fig. 50A, code data of the background object 2300 is VO data A-1-1, and code data of the men objects
2304 and 2305 are VO data A-1-2 and VO data A-1-3.

[0230] Image data, encoded by the MPEG4 coding of Core profile and level 1 at a bit rate of 200 kbps, is stored into
the storage device 2202. Fig. 46B shows an example of the image stored in the storage device 2202. Fig. 50B shows
the code data of the image. In the image of Fig. 46B, a background object 2301 includes objects 2306 and 2307 representing
a man and a woman. In Fig. 50B, code data of the background object 2301 is VO data B-1-1, and code data of
the man and woman objects 2306 and 2307 are VO data B-1-2 and VO data B-1-3.
[0231] In a case where the TV camera 2203 obtains an image as shown in Fig. 46C and the encoder 2204 encodes
the image data by the MPEG4 coding of Simple profile and level 1 at a bit rate of 32 kbps, as a new object is not
extracted from the obtained image, the entire image is handled as one object 2302. Accordingly, as shown in Fig. 50C,
code data of the image comprises VO data C-1-1, code data of one object 2302.

[0232] Further, in a case where an image as shown in Fig. 46D is encoded by the MPEG4 coding of Simple profile
and level 2 and inputted from the communication line 2205 via the communication interface 2206, a background object
2303 in the image in Fig. 46D includes objects 2308 and 2309 representing a woman and a man. Fig. 50D shows code
data of the image, in which code data of the background object 2303 is VO data D-1-1, code data of the man and
woman objects 2308 and 2309 are VO data D-1-2 and VO data D-1-3.

[0233] Note that to simplify the explanation, the sizes of all the above-described images (Figs. 46A to 46D) are defined
with QCIF (Quarter Common Intermediate Format).

[Image Editing Unit] 

[0234] All the code data are inputted into the image editing unit 2208. Fig. 47 is a block diagram showing the construction
of the image editing unit 2208. In Fig. 47, numerals 2101 to 2104 denote system code memories for storing
system-related code data for respective inputs; 2105 to 2108, video code memories for storing moving image code data
for respective inputs; 2109, a video decoder which decodes moving image code data to reproduce objects; and 2110,
a system decoder which decodes the system code data to reproduce object arrangement information and the like.

[0235] The results of decoding are outputted to the editing operation unit 2207, and the respective objects are displayed
in accordance with the arrangement information. The editing operation unit 2207 newly sets display timing,
speed and the like, in accordance with designation of arrangement of these objects, size change, deformation and the
like, instructed by the user.

[0236] Numeral 2111 denotes a system code synthesizer which synthesizes system code; 2112, a header processor
which synthesizes or changes headers of video code; 2113, a selector which arbitrarily selects one of outputs from the
video code memories 2105 to 2108 and outputs the selected output; and 2114, a multiplexer which multiplexes outputs from
the system code synthesizer 2111, the header processor 2112 and the selector 2113 to generate code data.
[0237] In the image editing unit 2208, respective outputs from the storage devices 2201 and 2202, the encoder 2204
and the communication interface 2206 are separated into system code data and moving image code data. The system
code data are stored into the system code memories 2101 to 2104, and the moving image code data are stored into
the video code memories 2105 to 2108.

[0238] When the respective code data have been stored, the video decoder 2109 and the system decoder 2110
decode the respective data, and output the decoded data to the editing operation unit 2207. In the editing operation unit
2207, the user sets deletion/holding of objects, change of arrangement, moving image start timing, frame rate
and the like. The video decoder 2109 and the system decoder 2110 arbitrarily perform decoding in accordance with the
editing operation.

[0239] Fig. 48 shows an example of an image synthesized from the images shown in Figs. 46A to 46D. That is, a new
image 2320 is generated by editing and synthesizing the four images. The size of the image 2320 is defined with CIF
format because the four QCIF images are synthesized without overlapping with each other. In the image 2320, the
background object 2300, the object 2302, the background objects 2303 and 2301 are arranged, from an upper left position
in a clockwise manner. Further, the men objects 2304 and 2305 are moved horizontally in rightward direction
(edited). The object 2308 is enlarged and moved onto the background object 2300 (edited).
[0240] The system code synthesizer 2111 reads out the system code data from the system code memories in accordance
with the results of synthesizing, then generates new system code data with arrangement information corresponding
to these deformation and movement, and outputs the new system code data to the multiplexer 2114.

[0241] Next, the changing condition accompanying synthesizing of respective objects will be described below.
[0242] First, regarding the background object 2300, coordinates, start timing and the like have not been changed.
Regarding the background object 2301, its coordinates (0,0) have been changed to (0,144). Regarding the object 2302,
its coordinates (0,0) have been changed to (176,0). Regarding the background object 2303, its coordinates (0,0) have
been changed to (176,144).

[0243] Regarding the men objects 2304 and 2305, coordinate values for the rightward movement have been added
to their coordinates. Regarding the objects 2306 and 2307, the coordinates have been changed in correspondence with
the change of the coordinates of the background object 2301 from (0,0) to (0,144), so as to move the absolute positions
downward by "144". Regarding the object 2308, new coordinates have been generated based on the expansion designation
(magnification ratio) and a new distance from the origin (0,0). Regarding the object 2309, its coordinates have
been changed in correspondence with the change of the coordinates of the background object 2303 from (0,0) to
(176,144), so as to move the absolute position rightward by "176" and downward by "144".
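
The coordinate changes of paragraphs [0242] and [0243] amount to adding the new origin of each background to the positions of the objects that belong to it. The sketch below is illustrative: the names `NEW_ORIGINS`, `shift` and `canvas_size` are introduced here, and the child positions used in the usage note are hypothetical because the source gives no original coordinates for the objects 2306 to 2309.

```python
QCIF_WIDTH, QCIF_HEIGHT = 176, 144

# New origins of the four QCIF pictures inside the synthesized picture,
# as listed in paragraph [0242].
NEW_ORIGINS = {2300: (0, 0), 2302: (176, 0), 2303: (176, 144), 2301: (0, 144)}


def shift(position, offset):
    """Move an object's display position by the offset of its background."""
    return position[0] + offset[0], position[1] + offset[1]


def canvas_size(origins):
    """Size of the picture needed to hold every QCIF picture at its origin."""
    width = max(x for x, _ in origins.values()) + QCIF_WIDTH
    height = max(y for _, y in origins.values()) + QCIF_HEIGHT
    return width, height
```

For instance, an object assumed to sit at (40, 60) on the background object 2301 would move to `shift((40, 60), NEW_ORIGINS[2301])`, i.e. 144 lower, while `canvas_size(NEW_ORIGINS)` gives 352 by 288, the CIF size.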

[0244] Note that in horizontal movement of an object, the system code synthesizer 2111 merely adds the amount of
movement to the coordinates of display position with respect to the code data of the object; however, in expansion or
deformation processing, it generates commands corresponding to the processing and newly performs coding. Note
that system code in the MPEG4 standard is similar to the CG language VRML, therefore, detailed commands are
approximately similar to those in the VRML or ISO/IEC14496-1.

[0245] On the other hand, the header processor 2112 generates a new header in correspondence with the results of
editing of the system code data. Fig. 49 is a block diagram showing the detailed construction of the header processor
2112. In Fig. 49, numeral 2120 denotes a separator which separates input header information for respective codes and
determines output destinations; 2121, a profile determination unit; 2122, an object number determination unit; 2123, a
bit rate determination unit; and 2124, a profile determination unit which determines a profile.

[0246] In the header processor 2112, the separator 2120 extracts PLI code, VOSC and bitrate code from header information
of the respective objects, from the video code memories 2105 to 2108, and inputs the extracted code into the profile
determination unit 2121, the object number determination unit 2122 and the bit rate determination unit 2123. The profile
determination unit 2121 decodes the PLI code and detects the highest profile and level from profiles and levels of
images to be synthesized. The object number determination unit 2122 counts the number of objects included in the
code data by counting the VOSC. The bit rate determination unit 2123 detects the respective bit rates by decoding the
bitrate code, and obtains the total sum of the bit rates. The outputs from the respective determination units are inputted
into the profile determination unit 2124.

[0247] The profile determination unit 2124 determines a profile and level satisfying the highest profile, the number of objects and the bit rate, by referring to the profile table shown in Fig. 28. In the present embodiment, the highest profile of the four images to be synthesized is Core profile and level 2, the number of objects of the synthesized image is 10, and the total sum of the bit rates is 684 kbps. Accordingly, the profile and level satisfying these conditions is, according to the profile table, Main profile and level 3. The profile determination unit 2124 generates new PLI code based on Main profile and level 3, and outputs the PLI code.
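
The table lookup performed by the profile determination unit 2124 can be sketched as follows. This is only an illustration under stated assumptions: the profile table of Fig. 28 is not reproduced in this passage, so `PROFILE_TABLE` is a hypothetical stand-in in which only the Core profile, level 2 row (8 objects, 2048 kbps) is taken from the text, and the ordering in `PROFILE_RANK` as well as the names `ProfileLevel` and `select_profile_level` are introduced here for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Assumed ordering of profiles, from least to most capable.
PROFILE_RANK = {"Simple": 0, "Core": 1, "Main": 2}


@dataclass(frozen=True)
class ProfileLevel:
    profile: str
    level: int
    max_objects: int
    max_kbps: int


# Hypothetical stand-in for the profile table of Fig. 28. Only the
# Core profile, level 2 limits (8 objects, 2048 kbps) are quoted in the
# text; every other row is a placeholder.
PROFILE_TABLE = [
    ProfileLevel("Simple", 1, 4, 64),
    ProfileLevel("Simple", 2, 4, 128),
    ProfileLevel("Simple", 3, 4, 384),
    ProfileLevel("Core", 1, 4, 384),
    ProfileLevel("Core", 2, 8, 2048),
    ProfileLevel("Main", 3, 32, 15000),
]


def rank(profile: str, level: int) -> Tuple[int, int]:
    """Sort key: profile first, then level within the profile."""
    return (PROFILE_RANK[profile], level)


def select_profile_level(
    highest: Tuple[str, int], num_objects: int, total_kbps: int
) -> Optional[ProfileLevel]:
    """Return the lowest table entry that is at least the highest input
    profile/level and can hold the object count and the total bit rate."""
    for entry in sorted(PROFILE_TABLE, key=lambda e: rank(e.profile, e.level)):
        if (
            rank(entry.profile, entry.level) >= rank(*highest)
            and entry.max_objects >= num_objects
            and entry.max_kbps >= total_kbps
        ):
            return entry
    return None  # nothing in the table satisfies the conditions


chosen = select_profile_level(("Core", 2), num_objects=10, total_kbps=684)
print(chosen.profile, chosen.level)
```

With the placeholder table, the values of the present embodiment (Core profile and level 2, 10 objects, 684 kbps) fail the 8-object limit of Core profile and level 2 and resolve to Main profile and level 3, in agreement with the result stated above.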

[0248] The multiplexer 2114 multiplexes the system code data generated by the system code synthesizer 2111 and the code data of the moving image. The moving image code data is reproduced by reading the code, in which profile-related code or the like has been corrected, from the header processor 2112, or arbitrarily reading the code data stored in the video code memories 2105 to 2108, and multiplexing the read data. Then, the multiplexed code data is outputted to the storage device 2209 and the decoder 2210.

[Processing Procedure]

[0249] Fig. 50E shows code data obtained as a result of multiplexing by the multiplexer 2114. It is understood from Fig. 50E that all the code data shown in Figs. 50A to 50D are synthesized, i.e., all the objects in Figs. 46A to 46D are included. Note that in the multiplexed code data, user data may be positioned prior to the code data of the respective objects, or collectively positioned in a predetermined position within the code data.

[0250] Fig. 51 is a flowchart showing image processing according to the present embodiment. When the apparatus has been started, code data of images are inputted from the respective image input means (storage devices 2201 and 2202, encoder 2204 and communication interface 2206), and stored into the code memories 2101 to 2104 and 2105 to 2108 (step S101). Then, the code data are respectively decoded, and images represented by the decoded data are presented to the user (step S102). Thereafter, the results of the user's editing at the editing operation unit 2207 are obtained (step S103), and the system code is changed (step S104) in accordance with the obtained results of editing. Further, the headers of the moving image code data are changed in accordance with the profile and level, the number of objects, the bit rate and the like, so as to generate new code (step S105). Then, in the multiplexer 2114, the system code data and video code data are multiplexed and outputted (step S106).

[0251] As code data synthesized by the image editing unit 2208 is inputted into the decoder 2210, the decoder 2210 easily detects the scale of input code data to be decoded, the number of necessary decoders and the like. Accordingly, it can be easily determined whether or not decoding is possible without actually decoding the code data. For example, even if it is determined that decoding is impossible, the code data can be temporarily stored into the storage device 2209 and decoded when a necessary number of decoders are provided.
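
The decision described above, made from header information alone, might look like the following sketch. The names `StreamHeader`, `DecoderCapability`, `can_decode` and `dispatch` are hypothetical, and the sketch assumes that the number of objects and the total bit rate are the only quantities compared.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StreamHeader:
    num_objects: int  # e.g. obtained by counting the VOSC
    total_kbps: int   # sum of the per-object bit rates


@dataclass(frozen=True)
class DecoderCapability:
    num_object_decoders: int
    max_kbps: int


def can_decode(header: StreamHeader, cap: DecoderCapability) -> bool:
    """Decide from the header alone, without decoding any video code data."""
    return (
        header.num_objects <= cap.num_object_decoders
        and header.total_kbps <= cap.max_kbps
    )


def dispatch(header: StreamHeader, cap: DecoderCapability) -> str:
    """Decode now if possible; otherwise keep the code data for later."""
    return "decode" if can_decode(header, cap) else "store"


header = StreamHeader(num_objects=10, total_kbps=684)
print(dispatch(header, DecoderCapability(num_object_decoders=8, max_kbps=2048)))
print(dispatch(header, DecoderCapability(num_object_decoders=16, max_kbps=2048)))
```

The first call prints "store" because only eight object decoders are available for ten objects; the second prints "decode".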

[0252] Note that the arrangement of the system code memories 2101 to 2104 and the video code memories 2105 to 2108 of the present embodiment is not limited to that in Fig. 47; more code memories may be provided, or one memory may be divided into a plurality of areas. Further, a storage medium such as a magnetic disk may be employed.

[0253] According to the present embodiment, when code data of different profiles and/or levels are synthesized, profile and level are re-defined. Since the scale of code data to be inputted, a necessary number of decoders and the like are obtained in advance, in the decoder 2210, it can be easily determined whether or not decoding is possible.

Eleventh Embodiment

[0254] Hereinbelow, an eleventh embodiment of the present invention will be described. Note that the general construction of the moving image processing apparatus of the eleventh embodiment is similar to that of the above-described tenth embodiment in Fig. 45; therefore, an explanation of the construction will be omitted. In the eleventh embodiment, the user designates an arbitrary profile using the editing operation unit 2207, and the image editing unit 2208 generates code data based on the designated profile.

[Construction] 

[0255] Fig. 52 is a block diagram showing the detailed construction of the image editing unit 2208 according to the eleventh embodiment. In Fig. 52, elements corresponding to those in Fig. 47 of the tenth embodiment have the same reference numerals, and explanations of the elements will be omitted. In the eleventh embodiment, the MPEG4 coding method is employed as a moving image coding method; however, any other coding method is applicable as long as it encodes a plurality of objects within an image.

[0256] Numeral 2130 denotes a profile controller which performs various controls to synthesize input plural image data in correspondence with a profile designated from the editing operation unit 2207; 2131, a system code synthesizer which synthesizes system code; 2132, a header processor which synthesizes and changes headers of video code; 2134, a code length regulator which regulates code lengths of respective objects; 2136, an integration processor which performs integration processing on objects; and 2133, 2135 and 2137, selectors which switch respective inputs/outputs in accordance with an instruction from the profile controller 2130.

[0257] As in the case of the above-described tenth embodiment, the code data inputted from the storage devices 2201, 2202, the encoder 2204 and the communication interface 2206 are separated into system code data and moving image code data, and stored into the system code memories 2101 to 2104 and the video code memories 2105 to 2108.

[0258] Note that in the eleventh embodiment, the code data inputted from the storage devices 2201 and 2202, the encoder 2204 and the communication interface 2206 are the same as those in the above-described tenth embodiment. Accordingly, the respective images are the same as those in Figs. 46A to 46D, and the code data in Figs. 50A to 50D are obtained by encoding the respective images. Note that in the eleventh embodiment, code data (VO data A) of Core profile and level 2 and at a bit rate of 1024 kbps is inputted from the storage device 2201. Similarly, code data (VO data B) of Core profile and level 1 and at a bit rate of 384 kbps is inputted from the storage device 2202. Similarly, code data (VO data C) of Simple profile and level 3 and at a bit rate of 384 kbps is inputted from the encoder 2204, and code data (VO data D) of Core profile and level 2 and at a bit rate of 768 kbps is inputted from the communication interface 2206.

[0259] In this embodiment, these code data have information unique to the respective objects as user data. The objects in the eleventh embodiment are "people", "background" and "non-cut-out screen image". As user data of a "man" object, information indicating that the type of the object is "man", personal information of the man (sex, age, profession and the like), and further, the action of the man in the image (e.g., the men objects 2304 and 2305 are discussing, the man object 2307 is giving an injection to the girl object 2306) are provided. Such object-unique information is utilized upon an editing operation such as object search.

[0260] When the respective code data have been stored into the code memories, the video decoder 2109 and the system decoder 2110 respectively decode the code data and output the decoded data to the editing operation unit 2207. At the editing operation unit 2207, the user makes settings such as selection of objects to be deleted or held, change of arrangement, moving image start timing and frame rate; thus, the synthesized image 2320 as shown in Fig. 48 is obtained, as in the case of the tenth embodiment.

[Setting of Profile and Level]

[0261] As described above, in the eleventh embodiment, the user can arbitrarily set the profile and level of code data to be outputted, from the editing operation unit 2207. Accordingly, when the generated code data is delivered by broadcasting or the like, the user can adjust the profile and level of the code data to those of a decoder to receive the code data. Hereinbelow, a case where the user has designated Core profile and level 2 at the editing operation unit 2207 will be described.

[0262] The user's designation of profile and level is inputted, with the results of editing, into the profile controller 2130. The synthesized image 2320 shown in Fig. 48 includes 10 objects, and the total sum of the bit rates is 2560 kbps. Further, in Core profile and level 2 designated by the user, the maximum number of objects is 8, and the maximum bit rate is 2048 kbps, according to the profile table in Fig. 28. To perform decoding of the designated profile at the designated level, the number of objects of the synthesized image must be reduced by two, and the bit rate must be controlled to 2048 kbps.

[0263] The profile controller 2130 reduces the code length of code data based on the following conditions, in the numbered priority order.

(1) Code length is reduced from the highest profile and level

(2) Code length is reduced from the highest bit rate

(3) All the code lengths are reduced

[0264] Hereinbelow, the bit rate of the VO data A is reduced from 1024 kbps to 512 kbps by reducing the code length of the VO data A based on these conditions.
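
The priority conditions above can be illustrated with the following sketch, which uses the four bit rates stated for the eleventh embodiment. Several points are assumptions made for illustration only: the profile ordering in `PROFILE_RANK`, the rule that no stream is cut by more than half, and the names `VideoObjectData` and `reduce_bit_rates`. Condition (3) is only approximated by the loop moving on to the remaining streams when the first cut is not enough.

```python
from dataclasses import dataclass
from typing import Dict, List

PROFILE_RANK = {"Simple": 0, "Core": 1, "Main": 2}  # assumed ordering


@dataclass(frozen=True)
class VideoObjectData:
    name: str
    profile: str
    level: int
    kbps: int


def reduce_bit_rates(streams: List[VideoObjectData], budget_kbps: int) -> Dict[str, int]:
    """Return the new bit rate of each stream so that the total fits the budget."""
    new_rates = {s.name: s.kbps for s in streams}
    excess = sum(new_rates.values()) - budget_kbps
    # Condition (1): highest profile and level first.
    # Condition (2): among equals, the highest bit rate first.
    ordered = sorted(
        streams,
        key=lambda s: (PROFILE_RANK[s.profile], s.level, s.kbps),
        reverse=True,
    )
    for s in ordered:
        if excess <= 0:
            break
        cut = min(excess, s.kbps // 2)  # assumed cap: at most half of a stream
        new_rates[s.name] -= cut
        excess -= cut
    return new_rates


streams = [
    VideoObjectData("A", "Core", 2, 1024),
    VideoObjectData("B", "Core", 1, 384),
    VideoObjectData("C", "Simple", 3, 384),
    VideoObjectData("D", "Core", 2, 768),
]
rates = reduce_bit_rates(streams, budget_kbps=2048)
print(rates, sum(rates.values()))
```

The VO data A and D share the highest profile and level, and A has the higher bit rate, so A alone absorbs the 512 kbps excess (2560 - 2048) and drops from 1024 kbps to 512 kbps, matching the example in the text.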

[0265] Further, to reduce the number of objects, two objects may be synthesized into one object, for example. In the eleventh embodiment, objects to be integrated are determined from a plurality of objects, by referring to node information in the system code stored in the system code memories 2101 to 2104. That is, parent-child relations of nodes are referred to, and objects having the same parent are integrated.

[0266] Hereinbelow, the object integration processing according to the eleventh embodiment will be described.

[0267] Figs. 53A to 53D show node statuses of the respective objects in the eleventh embodiment. Fig. 53A shows a node relation of the image in Fig. 46A. The code data is divided into the background 2300 and a People node representing people; further, the People node is a parent of the "men" objects 2304 and 2305. Similarly, Fig. 53B shows a node relation of the image in Fig. 46B; Fig. 53C, a node relation of the image in Fig. 46C; and Fig. 53D, a node relation of the image in Fig. 46D. That is, in Fig. 53A, the "men" objects 2304 and 2305 are connected to the People node; in Fig. 53B, the "girl" object 2306 and the "doctor" object 2307 are connected to the People node; in Fig. 53D, the "woman" object 2308 and the "man" object 2309 are connected to a dancer node.

[0268] Accordingly, in the eleventh embodiment, the objects connected to the People and dancer nodes indicative of people are determined as objects to be integrated for the respective images. That is, in the image in Fig. 46A, the objects 2304 and 2305 are integrated. Similarly, in the image in Fig. 46B, the objects 2306 and 2307 are integrated; in the image in Fig. 46D, the objects 2308 and 2309 are integrated. By this integration, the number of objects in the synthesized image becomes seven, and the number of objects satisfies Core profile and level 2.
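
The grouping of objects under a common parent node can be sketched as follows. The flat `SCENE` list is a simplified, hypothetical representation of the node relations of Figs. 53A to 53D; in particular, treating the backgrounds and the object 2302 as attached directly to the scene root (parent `None`) is an assumption, and `integrate_by_parent` is a name introduced here.

```python
from collections import defaultdict
from typing import List, Optional, Tuple

# (image, object number, parent node); None means attached to the scene root.
SCENE: List[Tuple[str, int, Optional[str]]] = [
    ("A", 2300, None), ("A", 2304, "People"), ("A", 2305, "People"),
    ("B", 2301, None), ("B", 2306, "People"), ("B", 2307, "People"),
    ("C", 2302, None),
    ("D", 2303, None), ("D", 2308, "dancer"), ("D", 2309, "dancer"),
]


def integrate_by_parent(scene):
    """Merge objects of the same image that share a parent node.

    Returns a list of (image, tuple of object numbers), one entry per
    object that remains after integration.
    """
    merged = []
    groups = defaultdict(list)
    for image, obj, parent in scene:
        if parent is None:
            merged.append((image, (obj,)))  # kept as a separate object
        else:
            groups[(image, parent)].append(obj)
    for (image, _parent), objs in groups.items():
        merged.append((image, tuple(objs)))  # one integrated object per parent
    return merged


result = integrate_by_parent(SCENE)
print(len(SCENE), "->", len(result))
```

Under these assumptions the ten input objects become seven: four root-level objects plus one integrated object for each of the three People/dancer nodes.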

[0269] The profile controller 2130 instructs the system code synthesizer 2131 to newly reproduce the arrangement information of the respective objects after the object integration. The system code synthesizer 2131 generates system code data in the state where the objects are integrated, as in the case of the tenth embodiment.

[0270] At the same time, the profile controller 2130 instructs the header processor 2132 to newly generate header information of the respective objects after the object integration. That is, the size of the image is changed to CIF (352x288), the bit rate is set to 2048 kbps, and the PLI code is set to Core profile and level 2. Further, code such as VOL_width, VOL_height, VOP_width, VOP_height and bitrate of the integrated objects are corrected.

[0271] The selector 2133 switches a data path so as to pass the object of the image in Fig. 46A (VO data A) through the code length regulator 2134, and not to pass the other objects through the code length regulator 2134, under the control of the profile controller 2130.

[0272] Fig. 54 is a block diagram showing the construction of the code length regulator 2134. An object decoder 2141 decodes input video code data, and an object encoder 2142 encodes the decoded data using quantization coefficients greater than those in the initial encoding. That is, the bit rate can be reduced by re-encoding the objects of the image in Fig. 46A by rough quantization.

[0273] The selector 2135 switches a data path such that the combinations of the objects 2304 and 2305, the objects 2306 and 2307, and the objects 2308 and 2309 are inputted into the integration processor 2136, under the control of the profile controller 2130.

[0274] The detailed construction of the integration processor 2136 is the same as that of the object integrator 1123 of the seventh embodiment in Fig. 34. Accordingly, explanations of the construction and processing of the integration processor 2136 will be omitted.

[0275] The code data on synthesized color image information and code data on mask information are inputted via a selector 2137 into the multiplexer 2114, and are multiplexed to code data of one object. The result of the system code synthesizer 2131, the header generated by the header processor 2132, and code data corresponding to the header are sequentially inputted via the selector 2137 into the multiplexer 2114, and multiplexed and outputted.

[0276] Fig. 57 shows the data structure of code data outputted from the image editing unit 2208 of the eleventh embodiment. In Fig. 57, in the video object data, the newly-set PLI code (PLIN-I in Fig. 57) is provided at the head. Then the VO data A-1-1 corresponding to the background object 2300, and the VO data A-1-23 corresponding to the object synthesized from the objects 2304 and 2305 follow. Further, the VO data B-1-1 corresponding to the background object 2301, the VO data B-1-23 corresponding to the object synthesized from the objects 2306 and 2307, the VO data C-1-1 corresponding to the object 2302, the VO data D-1-1 corresponding to the background object 2303, and the VO data D-1-23 corresponding to the object synthesized from the objects 2308 and 2309 follow. That is, seven video objects exist in one Visual Object.

[0277] The code data obtained as above is stored into the storage device 2209, or decoded by the decoder 2210 and displayed as an image as shown in Fig. 48 on the display unit 2211.

[0278] As described above, according to the eleventh embodiment, when code data of different profiles and levels are synthesized, profile and level are re-defined, and further, the number of objects and the bit rate can be regulated. Thus, code data of the profile and level desired by the user can be obtained.

[0279] Further, respective objects within an image can be arbitrarily synthesized by integrating objects based on the relation among the objects (nodes) described in the system code. That is, a synthesizing procedure closer to a user's intuitive synthesizing procedure can be realized.

Modification of Eleventh Embodiment 

[0280] Fig. 55 is a block diagram showing a modified construction of the code length regulator 2134 according to the eleventh embodiment. If input video code data has been motion compensated, a Huffman decoder 2143 decodes the quantization DCT coefficients. The Huffman decoder 2143 inputs the obtained quantization DCT coefficients into a high frequency eliminator 2144, to eliminate high frequency components by replacing the high frequency components with "0". Then, a Huffman encoder 2145 encodes the output from the high frequency eliminator 2144. That is, the code length can be reduced by eliminating high frequency components of the object and re-encoding the data.
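
The operation of the high frequency eliminator 2144 can be illustrated on a single 8x8 block of quantized DCT coefficients. The text does not state which coefficients count as high frequency, so the rule used here (zero every coefficient whose row index plus column index reaches a cutoff) and the name `eliminate_high_frequencies` are assumptions made for this sketch; Huffman decoding and encoding are left out.

```python
from typing import List


def eliminate_high_frequencies(block: List[List[int]], cutoff: int) -> List[List[int]]:
    """Replace with 0 every coefficient whose row + column index is >= cutoff.

    block[0][0] is the DC coefficient; indices grow toward higher
    vertical (row) and horizontal (column) frequencies.
    """
    return [
        [coeff if (u + v) < cutoff else 0 for v, coeff in enumerate(row)]
        for u, row in enumerate(block)
    ]


# Dummy block with distinct non-zero values 1..64, for demonstration only.
block = [[u * 8 + v + 1 for v in range(8)] for u in range(8)]
filtered = eliminate_high_frequencies(block, cutoff=4)
kept = sum(1 for row in filtered for coeff in row if coeff != 0)
print(kept)
```

With a cutoff of 4, only the ten lowest-frequency coefficients of the 64 remain non-zero, so the long runs of zeros that follow would be expected to shorten the entropy-coded data.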
[0281] Fig. 56 is a block diagram showing another modified construction of the code length regulator 2134. If input video code data has been motion compensated, the Huffman decoder 2143 decodes the quantization DCT coefficients. Then, an inverse quantizer 2146 performs inverse quantization on the obtained quantization DCT coefficients, then a quantizer 2147 quantizes the obtained DCT coefficients using quantization coefficients greater than those used in the initial coding. Then the Huffman encoder 2145 encodes the data. That is, the code length can be reduced by decoding code data of a motion compensated object and re-encoding the data with rough quantization.
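
The inverse quantization and rough re-quantization performed by the inverse quantizer 2146 and the quantizer 2147 can be sketched as below. A plain uniform quantizer with a single step size is assumed purely for illustration; the actual MPEG4 quantization rules are more involved, and the name `requantize` and the step values are hypothetical.

```python
from typing import List


def requantize(levels: List[int], q_old: int, q_new: int) -> List[int]:
    """Inverse-quantize with step q_old, then quantize again with step q_new.

    q_new is expected to be at least q_old, i.e. the second quantization
    is rougher than the first.
    """
    if q_new < q_old:
        raise ValueError("q_new must not be smaller than q_old")
    result = []
    for level in levels:
        coeff = level * q_old  # inverse quantization (uniform, assumed)
        sign = -1 if coeff < 0 else 1
        # Integer rounding to the nearest level, halves rounded away from zero.
        magnitude = (abs(coeff) * 2 + q_new) // (2 * q_new)
        result.append(sign * magnitude)
    return result


print(requantize([12, -5, 3, 1, 0], q_old=4, q_new=16))
```

The example prints [3, -1, 1, 0, 0]: the levels become smaller and one more of them becomes zero, which is what allows the subsequent Huffman encoding to produce shorter code.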
[0282] Note that in the eleventh embodiment, objects to be integrated may be selected using information unique to the respective objects, described independently of the user data or code data, in addition to the relation among the objects indicated by nodes. That is, objects having similar attributes ("people", "profession" and the like) may be integrated. Further, the objects 2307 and 2306 may be integrated based on the attributes indicating actions of "people" objects such as "giving an injection" and "taking an injection" as selection conditions.

[0283] Further, objects to be integrated may be selected by the combination of plural conditions such as object size, code length, location relation and user's instruction.

[0284] Further, in the eleventh embodiment, objects are integrated based on the relation among the objects (nodes) described in the system code; however, the number of objects may be reduced by deleting objects selected based on the nodes. In this case, the bit rate can be reduced at the same time.

[0285] Note that the arrangement of the system code memories 2101 to 2104 and the video code memories 2105 to 2108 is not limited to that in Fig. 47, but more code memories may be provided or one memory may be divided into a plurality of areas. Further, a storage medium such as a magnetic disk may be employed.

[0286] As described above, according to the tenth and eleventh embodiments, one code data based on a predetermined standard can be obtained by synthesizing a plurality of code data, encoded for a plurality of image information (objects). Further, the synthesized code data may be decoded by a decoder of arbitrary coding specifications. Further, the number of objects and the code length of the code data can be regulated.

[0287] Further, in the above-described respective embodiments, the object 0 is background; however, the object 0 is not limited to the background but may be a moving image of a general object or the like.

[0288] The present invention can be applied to a system constituted by a plurality of devices (e.g., host computer, interface, reader, printer) or to an apparatus comprising a single device (e.g., copy machine, facsimile).

[0289] Further, the object of the present invention can be also achieved by providing a storage medium storing program codes for performing the aforesaid processes to a system or an apparatus, reading the program codes with a computer (e.g., CPU, MPU) of the system or apparatus from the storage medium, then executing the program.

[0290] In this case, the program codes read from the storage medium realize the functions according to the embodiments, and the storage medium storing the program codes constitutes the invention.

[0291] Further, the storage medium, such as a floppy disk, a hard disk, an optical disk, a magneto-optical disk, CD-ROM, CD-R, a magnetic tape, a non-volatile type memory card, and ROM can be used for providing the program codes.

[0292] Furthermore, besides the case where the aforesaid functions according to the above embodiments are realized by executing the program codes which are read by a computer, the present invention includes a case where an OS (operating system) or the like working on the computer performs a part or the entirety of the processes in accordance with designations of the program codes and realizes functions according to the above embodiments.

[0293] Furthermore, the present invention also includes a case where, after the program codes read from the storage medium are written in a function expansion card which is inserted into the computer or in a memory provided in a function expansion unit which is connected to the computer, a CPU or the like contained in the function expansion card or unit performs a part or the entirety of the process in accordance with designations of the program codes and realizes functions of the above embodiments.

[0294] The present invention is not limited to the above embodiments and various changes and modifications can be made within the spirit and scope of the present invention. Therefore, to apprise the public of the scope of the present invention, the following claims are made.

Claims 

1. A data processing apparatus having decoding means for decoding code encoded in image object units, said apparatus comprising:

detection means for detecting a number of objects included in input code and a number of objects decodable by said decoding means; and

control means for controlling the number of objects of the input code, based on the number of objects and the number of decodable objects detected by said detection means.

2. The apparatus according to claim 1, wherein if said number of objects is greater than said number of decodable objects, said control means reduces the number of objects included in said code to said number of decodable objects.

3. The apparatus according to claim 1, further comprising:

extraction means for extracting location information of the objects included in said code; and

combining means for combining code of a plurality of objects, based on an instruction from said control means and the location information extracted by said extraction means.

4. The apparatus according to claim 3, wherein said combining means combines code of a plurality of objects away from each other by a distance therebetween, wherein said distance being shorter than other distances between objects calculated from said location information.

5. The apparatus according to claim 1, further comprising:

extraction means for extracting motion information indicative of motions of the objects included in said code; and

combining means for combining a plurality of objects based on an instruction from said control means and the motion information extracted by said extraction means.

6. The apparatus according to claim 5, wherein said combining means combines code of a plurality of objects having the motion information similar to each other.

7. The apparatus according to claim 1, further comprising:

extraction means for extracting code lengths of the objects included in said code; and

combining means for combining a plurality of objects based on an instruction from said control means and the code lengths extracted by said extraction means.

8. The apparatus according to claim 7, wherein said combining means combines code of a plurality of objects having code lengths shorter than other code lengths.

9. The apparatus according to claim 1, further comprising initialization means for determining a coding method for encoding the input code in frame units, and initializing said control means based on the result of determination.

10. The apparatus according to claim 9, wherein said initialization means determines whether said code is encoded based on interframe correlation or encoded based on intraframe information.

11. The apparatus according to claim 10, wherein if said code is encoded based on the intraframe information, said initialization means initializes said control means.

12. The apparatus according to claim 1, wherein said code is code of a still image.

13. The apparatus according to claim 1, wherein said code is code of a moving image.

14. A data processing method for decoding code encoded in image object units, said method comprising the steps of:

detecting the number of objects included in input code and the number of objects decodable by said means; and

controlling the number of objects of the input code, based on the number of objects and the number of decodable objects detected at said detection step.

15. The method according to claim 14, wherein at said control step, if said number of objects is greater than said number of decodable objects, the number of objects included in said code is reduced to said number of decodable objects.

16. A computer program product comprising a computer readable medium having computer program code, for executing data processing which decodes code encoded in image object units, said product comprising:

detecting procedure code for detecting a number of objects included in input code and a number of decodable objects; and

controlling procedure code for controlling the number of objects of the input code, based on the number of objects and the number of decodable objects detected in said detection procedure.

17. A data processing apparatus for processing a data array to reproduce an image with a plurality of coded image objects, said apparatus comprising:

detection means for detecting a number of image objects included in said data array; and

control means for controlling the number of image objects included in said data array based on the number of image objects detected by said detection means.

18. The apparatus according to claim 17, wherein if said number of image objects is greater than a predetermined number, said control means reduces the number of image objects included in said data array.

19. The apparatus according to claim 18, wherein said predetermined number is a number of objects which can be processed by decoding means for decoding said data array.

20. The apparatus according to claim 18, wherein said control means reduces the number of image objects by deleting an image object.

21. The apparatus according to claim 20, wherein said control means obtains code lengths of the respective image objects in said data array and deletes the image object based on the obtained code lengths.

22. The apparatus according to claim 21, wherein said control means deletes sequentially from an image object having the shortest code length.

23. The apparatus according to claim 20, wherein said control means obtains image sizes of the respective image objects in said data array, and deletes the image object based on the obtained image sizes.

24. The apparatus according to claim 23, wherein said control means deletes sequentially from an image object having the minimum size.

25. The apparatus according to claim 20, further comprising setting means for setting a priority order of the image objects in said data array,

wherein said control means deletes the image object based on the priority order set by said setting means.

26. The apparatus according to claim 20, wherein said control means reduces the number of the image objects by integrating a plurality of image objects.

27. The apparatus according to claim 26, further comprising selection means for selecting a plurality of image objects included in said data array,

wherein said control means integrates the plurality of image objects selected by said selection means.

28. The apparatus according to claim 27, wherein said control means comprises decoding means for decoding the plurality of image objects selected by said selection means, synthesizing means for synthesizing the plurality of image objects decoded by said decoding means, and coding means for encoding an image object synthesized by said synthesizing means, so as to integrate the plurality of image objects selected by said selection means.

29. The apparatus according to claim 28, wherein said control means further comprises counting means for counting code lengths of the plurality of image objects selected by said selection means,

and wherein said coding means controls coding parameters based on the results of counting by said counting means.

30. The apparatus according to claim 27, wherein said control means comprises separation means for separating the plurality of image objects selected by said selection means into color information and mask information, color information synthesizing means for synthesizing the color information separated by said separation means, mask information synthesizing means for synthesizing the mask information separated by said separation means, and multiplexing means for multiplexing the color information synthesized by said color information synthesizing means and the mask information synthesized by said mask information synthesizing means.

31. The apparatus according to claim 27, wherein said selection means enables manual selection of a plurality of image objects.

32. The apparatus according to claim 27, wherein said selection means selects a plurality of image objects based on spatial location information of the respective image objects.

33. The apparatus according to claim 27, wherein said selection means selects a plurality of image objects, one of which including the other.

34. The apparatus according to claim 27, wherein said selection means selects a plurality of image objects away from each other by a distance less than a predetermined value.

35. The apparatus according to claim 27, wherein said selection means obtains code lengths of the respective image objects in said data array, and selects an image object based on the obtained code lengths.

36. The apparatus according to claim 35, wherein said selection means selects sequentially from an image object having the shortest code length.

37. The apparatus according to claim 27, wherein said selection means obtains image sizes of the respective image objects in said data array, and selects the image object based on the obtained image sizes.

38. The apparatus according to claim 37, wherein said selection means selects sequentially from an image object having the minimum image size.

39. The apparatus according to claim 37, further comprising setting means for setting a priority order of the image objects in said data array,

wherein said selection means selects the image object based on the priority order set by said setting means.

40. The apparatus according to claim 17, wherein said data array is code data adapted to or based on the MPEG4 standard.

41. A data processing method for processing a data an-ay to reproduce an image witti a plurality of coded image 
objects, said method comprising the steps of: 

55 

detecting a number of image objects included in said data array; and 

controlling the number of image objects included in said data array based on the number of image objects
detected at said detection step.
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42. A computer program product comprising a computer readable medium having computer program code, for execut-
ing data processing which processes a data array to reproduce an image with a plurality of coded image objects,
said product comprising:

detection procedure code for detecting a number of image objects included in said data array; and

control procedure code for controlling the number of image objects included in said data array based on the 
number of image objects detected in said detection procedure. 

43. A data processing apparatus comprising: 

input means for inputting a plurality of image data to construct one frame, wherein said image data respectively
including N image objects, where N≥1 holds; and

generation means for generating image data having M image objects, where M≥1 holds, constructing said one
frame, by integrating at least a part of said N image objects based on additional information indicative of rela-
tion among the image objects.

44. The apparatus according to claim 43, wherein said M image objects are an appropriate number of image objects 
to be processed in accordance with a predetermined coding standard. 

45. The apparatus according to claim 44, wherein if the number of image objects included in said image data inputted
by said input means is greater than the number of objects defined by said predetermined coding standard, said 
generation means performs integration processing. 

46. The apparatus according to claim 43, wherein said image data inputted by said input means is adapted to or based
on the MPEG4 standard.

47. A data processing method comprising the steps of: 

inputting a plurality of image data to construct one frame, wherein said image data respectively including N
image objects, where N≥1 holds; and

generating image data having M image objects, where M≥1 holds, constructing said one frame, by integrating
at least a part of said N image objects based on additional information indicative of relation among the image
objects.

48. A computer program product comprising a computer readable medium having computer program code, for execut-
ing data processing, said product comprising:

input procedure code for inputting a plurality of image data to construct one frame, wherein said image data
respectively including N image objects, where N≥1 holds; and

generation procedure code for generating image data having M image objects, where M≥1 holds, constructing
said one frame, by integrating at least a part of said N image objects based on additional information indicative
of relation among the image objects.

49. A data processing apparatus for processing a data array to reproduce one frame image with a plurality of coded
image objects, said apparatus comprising:

input means for inputting a plurality of data arrays; 

instruction means for instructing synthesizing of a plurality of data arrays inputted by said input means;

designation means for designating coding specifications of a processed data array;

control means for controlling information amounts of the plurality of data arrays inputted by said input means,
based on the coding specifications designated by said designation means; and

synthesizing means for synthesizing the plurality of data arrays with information amounts controlled by said
control means, based on the coding specifications designated by said designation means.

50. The apparatus according to claim 49, wherein said synthesizing means sets coding specifications of synthesized
data array to the coding specifications designated by said designation means.

51. The apparatus according to claim 49, wherein said instruction means instructs synthesizing including change of
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spatial locations of the image objects included in said data array.

52. The apparatus according to claim 49, wherein if the number of image objects included in the data array synthesized
by said synthesizing means is greater than the number of image objects according to the coding specifications des-
ignated by said designation means, said control means reduces the number of image objects included in the data
array.

53. The apparatus according to claim 52, wherein said control means reduces the number of image objects by integrat-
ing image objects included in the data array.

54. The apparatus according to claim 53, wherein said control means selects image objects to be integrated, based on
relation among the image objects included in said data array.

55. The apparatus according to claim 54, wherein the relation among the image objects is represented by node infor- 
mation. 

56. The apparatus according to claim 52, wherein said control means reduces the number of image objects by deleting
at least one image object included in said data array.

57. The apparatus according to claim 49, wherein if a code length of an image object included in the data array synthe- 
sized by said synthesizing means is longer than a value based on the coding specifications designated by said des- 
ignation means, said control means reduces the code length of the image object. 

58. The apparatus according to claim 49, wherein said control means reduces a code length, sequentially from an
image object with the highest coding specifications, among the plurality of data arrays to be synthesized, included 
in a data array. 

59. The apparatus according to claim 49, wherein said control means reduces a code length of an image object by per- 
forming re-encoding with rough quantization coefficients, on a data array. 

60. The apparatus according to claim 49, wherein said control means reduces a code length of an image object by per- 
forming re-encoding which eliminates high frequency components, on a data array. 

61. The apparatus according to claim 49, further comprising transmission means for transmitting the data array synthe-
sized by said synthesizing means.

62. The apparatus according to claim 49, further comprising decoding means for decoding the data array synthesized 
by said synthesizing means. 

63. The apparatus according to claim 58, further comprising display means for displaying an image represented by the
data array decoded by said decoding means.

64. The apparatus according to claim 49, wherein said coding specifications are adapted to or based on the MPEG4 
standard. 

65. The apparatus according to claim 49, wherein said data array is adapted to or based on the MPEG4 standard.

66. A data processing method for processing a data array to reproduce one frame image with a plurality of coded image 
objects, said method comprising the steps of: 

inputting a plurality of data arrays;

instructing synthesizing of a plurality of data arrays inputted at said input step;
designating coding specifications of a processed data array; 

controlling information amounts of the plurality of data arrays inputted at said input step, based on the coding 
specifications designated at said designation step; and 

synthesizing the plurality of data arrays with information amounts controlled at said control step, based on the
coding specifications designated at said designation step.
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67. A computer program product comprising a computer readable medium having computer program code, for execut-
ing data processing which processes a data array to reproduce one frame image with a plurality of coded image
objects, said product comprising:

input procedure code for inputting a plurality of data arrays; 

instruction procedure code for instructing synthesizing of a plurality of data arrays inputted in said input proce-
dure;

designation procedure code for designating coding specifications of a processed data array; 

control procedure code for controlling information amounts of the plurality of data arrays inputted in said input 

procedure, based on the coding specifications designated in said designation procedure; and 

synthesizing procedure code for synthesizing the plurality of data arrays with information amounts controlled in 

said control procedure, based on the coding specifications designated in said designation procedure. 
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(54) Object data processing apparatus, object data recording apparatus, data storage media, 
data structure for transmission 

(57) An object data processing apparatus for 
decoding N pieces of coded data (N = positive integer) 
obtained by compressively coding N pieces of object 
data which constitute individual data to be recorded or 
transmitted and have a hierarchical structure, for each 
object data. This apparatus includes hierarchical infor- 
mation extraction means for extracting hierarchical 
information showing the hierarchical relationship of the 
N pieces of object data, according to the coded data; 
and table creation means for creating, according to the
hierarchical information, an object table on which the 
respective object data are correlated with coded data of 
the respective object data. Therefore, the apparatus can 
perform extraction, selection, or retrieval of coded data 
corresponding to a specific object at high speed, and
this enables the user to edit or replace the object data in 
short time with high controllability. 




Copied from 09964647 on 02/18/2005 



EP0 862 330 A2 





Description 

FIELD OF THE INVENTION 

The present invention relates to object data
processing apparatus, object data recording apparatus,
data storage media, and data structure for transmission.
More particularly, the invention relates to an apparatus
for decoding compressed data, such as compressed
digital video data, digital audio data, and program data,
an apparatus for selecting desired data from the com-
pressed data, an apparatus for recording the com-
pressed data, a medium storing the compressed data,
an apparatus for outputting the compressed data, and a
data structure for transmitting the compressed data.

BACKGROUND OF THE INVENTION 

In recent years, with the progress in information
compression technology, a digital video/audio service
providing video information and audio information by
digital signals has been put to practical use for broad-
casting media, such as ground broadcasting, satellite
broadcasting, and CATV.

Under the existing circumstances, as a compressive
coding method for the next generation, an object
coding method has attracted attention. This object cod-
ing method is not to uniformly compress the whole
image, i.e., video data corresponding to a single image,
but to compress video data corresponding to a single
image in units of individual objects constituting the
image while paying attention to the contents of the
image.

When video data corresponding to a single image is
subjected to the compressive coding in object units,
compressed (coded) video data is separable corre-
sponding to the respective objects, whereby a specific
object in the image can be extracted or replaced.

Meanwhile, as a method of implementing a data
transmission format for making the best use of the
object coding method, a method of multiplexing com-
pressed video data, audio data, and other digital data is
discussed.

There is MPEG4 as an international standard of a
method of multiplexing data compressed by the object
coding method (ISO/IEC JTC1/SC29/WG11 N1483,
"System Working Draft", November 1996). Hereinafter,
a description is given of the data multiplexing method
based on MPEG4 and a method for reproducing the
multiplexed data, with reference to figures.

Figure 18 is a diagram for explaining the object cod-
ing method. In the figure, reference numeral 120 desig-
nates a scene (an image) in a series of images obtained
from video data with audio. This scene 120 is composed
of a plurality of objects (sub-images) making a hierarchi-
cal structure. To be specific, the scene 120 is composed
of three objects: a background image (background) 121,
a moving object 122 that moves in the background, and
a background audio 123 attendant on the background.
The moving object 122 is composed of four objects: a
first wheel 124, a second wheel 125, a body 126, and a
moving object audio 127 attendant on the moving
object. Further, the object of body 126 is composed of
two objects: a window 128 and the other part 129. In the
hierarchical structure, the objects 121-123 belong to
the uppermost first layer L1, the objects 124-127
belong to the second layer L2 lower than the first layer
L1, and the objects 128 and 129 belong to the third layer
L3 lower than the second layer L2.

In the object coding method, scene data corre-
sponding to the scene 120 are compressively coded in
units of the lowermost objects constituting the scene
120. In other words, scene data corresponding to the
scene 120 are compressively coded for each of the
objects 121, 123, 124, 125, 127, 128 and 129.
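
As a rough illustration of the hierarchy described for figure 18, the following Python sketch models scene 120 as a tree and lists its lowermost objects, which are the units that are actually coded. The dictionary layout and the function name are assumptions made for this example only; they do not appear in the original description.

```python
# Hypothetical model of scene 120 (figure 18): parent object -> child objects.
# Objects without an entry have no children, i.e. they are lowermost objects.
SCENE_120 = {
    120: [121, 122, 123],       # scene -> first-layer objects (L1)
    122: [124, 125, 126, 127],  # moving object -> second-layer objects (L2)
    126: [128, 129],            # body -> third-layer objects (L3)
}


def lowermost_objects(tree, root):
    """Return the leaf objects below root, collected depth first."""
    children = tree.get(root)
    if not children:
        return [root]
    leaves = []
    for child in children:
        leaves.extend(lowermost_objects(tree, child))
    return leaves


coded_units = sorted(lowermost_objects(SCENE_120, 120))
print(coded_units)  # [121, 123, 124, 125, 127, 128, 129]
```

The composite objects 122 and 126 do not appear in the result, which matches the list of coded objects given above.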

Figure 19 is a diagram for explaining a data struc-
ture for transmitting coded data corresponding to the
respective objects mentioned above, which is obtained
by performing object coding to the scene data of the
scene 120.

In figure 19, MEg shows a multiplexed bit stream 
having a prescribed format, obtained by multiplexing 
coded data of the respective objects and auxiliary data. 
This multiplexed bit stream MEg is transmitted as coded 
data corresponding to the scene data. 

The multiplexed bit stream MEg is partitioned into 
plural packets in prescribed units, i.e., each packet hav- 
ing prescribed number of bytes, and coded data of the 
respective objects are allocated to the packets having 
their own values (SLC=1, 2, ...) as logical channels 
(LC). 

To be specific, in the multiplexed bit stream MEg
shown in figure 19, coded video data of object [1] is allo-
cated to packets Pa3 and Pa6 having a logical channel
SLC=3, coded video data of object [2] is allocated to
packets Pa5 and Pa7 having a logical channel SLC=4,
and coded audio data of object [3] is allocated to a
packet Pa4 having a logical channel SLC=5. Information
relating to the byte number of packet when multiplexed,
the logical channel LC of each packet, and the packet
transmission order is allocated as control information to
a packet having another logical channel (not shown) for
transmission.

The objects [1] and [2] are the background image 
121 and the moving object 122 shown in figure 18, 
respectively, and the object [3] is the background audio 
123 shown in figure 18. 

In the multiplexed bit stream MEg, allocated to the
packet Pa1 of logical channel SLC=1 is information
relating to a scene composition method for regenerating
the scene composed of the respective objects (compo-
sition stream), and allocated to the packet Pa2 of logical
channel SLC=2 is information showing how the coded
data of the respective objects are multiplexed (stream
association table).
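
The packet layout described for figure 19 can be pictured with a short Python sketch that groups packets by logical channel. The packet order and channel numbers follow the description above (taking the audio packet to be Pa4); the tuple layout, the payload strings, and the function name are placeholders assumed for illustration.

```python
from collections import defaultdict

# (packet name, logical channel SLC, placeholder payload); not real coded data.
PACKETS = [
    ("Pa1", 1, "composition stream"),
    ("Pa2", 2, "stream association table"),
    ("Pa3", 3, "video of object [1], part 1"),
    ("Pa4", 5, "audio of object [3]"),
    ("Pa5", 4, "video of object [2], part 1"),
    ("Pa6", 3, "video of object [1], part 2"),
    ("Pa7", 4, "video of object [2], part 2"),
]


def demultiplex(packets):
    """Group packet names by logical channel, keeping transmission order."""
    by_channel = defaultdict(list)
    for name, slc, _payload in packets:
        by_channel[slc].append(name)
    return dict(by_channel)


channels = demultiplex(PACKETS)
print(channels[3])  # ['Pa3', 'Pa6']
```

Collecting the packets of one logical channel in order yields the stream of one object, which is why the channel number alone is enough to pull a stream out of the multiplexed data once it is known.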

Accordingly, when a plurality of coded data 
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obtained by object coding are multiplexed and transmit- 
ted, with the coded data of the respective objects, the 
composition stream showing the structure of a scene 
composed of the objects and the stream association 
table showing the correlation of the transmitted streams 
(each stream being a series of coded data correspond- 
ing to each object) are transmitted simultaneously. 

Figure 20 is a diagram for explaining a scene 
description according to the composition stream, illus- 
trating a description SD corresponding to the single 
image (scene) 120 shown in figure 18. 

In the scene description SD according to the com-
position stream, the image 120 is shown by Scene 140,
and the fact that the image 120 shown by Scene 140 is
composed of the background image 121, the moving
object 122, and the background audio 123 is shown by
Video(1) 141, Node(1) 142, and Audio(1) 143, respec-
tively. Here, Scene 140, Video(1) 141, Node(1) 142, and
Audio(1) 143 are descriptors describing the image 120,
the background image 121, the moving object 122, and
the background audio 123 shown in figure 18, respec-
tively.

Further, in the scene description SD, the fact that
the moving object 122 shown by Node(1) 142 is com-
posed of the first wheel 124, the second wheel 125, the
body 126, and the moving object audio 127 is shown by
Video(2) 144, Video(3) 145, Node(2) 146, and Audio(2)
147, respectively, which are descriptors corresponding
to these objects.

Further, the fact that the body 126 shown by 
Node(2) 146 is composed of the window 128 and the 
other part 129 is shown by Video(4) 148 and Video(5) 
149, respectively, which are descriptors corresponding 
to these objects. 

Each of the descriptors is given a stream index 
(stream id) for identifying a stream corresponding to 
coded data of each object in the multiplexed bit stream 
MEg. To be specific, as shown in figure 20, stream indi- 
ces Sid=1~Sid=5 are given to the descriptors 141-145, 
respectively, and stream indices Sid=6, Sid=7, and 
Sid=8 are given to the descriptors 148, 149, and 147, 
respectively. Sid is a specific number of each stream id. 

Accordingly, it can be seen from the scene descrip-
tion SD according to the composition stream what kinds
of objects a scene is composed of. However,
the scene description SD according to the composition
stream does not describe how the coded data corre-
sponding to the respective objects are multiplexed in the
actual multiplexed bit stream MEg.

Figure 21 is a diagram for explaining the stream 
association table AT. 

The stream association table AT shows the relation-
ship between the stream corresponding to coded data
of each object (i.e., a series of coded data correspond-
ing to each object) and the logical channel (LC) specify-
ing each packet which is the partition unit of coded data
when multiplexed. To be specific, on this table AT, the
stream indices (id) of the respective streams, the logical
channel values (LC) corresponding to the respective
streams, and the logical channel values (LC) corre-
sponding to upper streams of the respective streams
are correlated with each other. Here, the logical channel
LC corresponding to the upper stream of the streams
(Sid=1~3) corresponding to the objects 121~123 of the
first layer L1 corresponds to the logical channel LC
(SLC=2) of the packet Pa2 to which the stream associa-
tion table is allocated.

Accordingly, with reference to this table AT, the log-
ical channel LC corresponding to each stream and the
logical channel LC of its upper stream (host stream) can
be specified.

As described above, since the stream indices (Sid)
are added to the descriptors 141~145 and 147~149 of
the respective objects in the scene description SD
according to the composition stream shown in figure 20,
the respective objects can be identified by the stream
indices (Sid) from the composition stream and, there-
fore, the composition stream can be correlated with the
stream association table shown in figure 21.

As described above, the multiplexed bit stream
MEg includes the composition stream and the stream
association table together with the coded data corre-
sponding to the respective objects. Therefore, when the
coded data of the respective objects are reproduced by
decoding according to the multiplexed bit stream MEg, it
is possible to extract or retrieve coded data of a specific
object designated according to the composition stream
and the stream association table. This enables, for
example, editing of the objects 121 to 129 constituting
the scene 120 on the reproduction end.

In the multiplexed bit stream format according to the
prior art object coding, the scene description is
expressed as information (composition stream) sepa-
rated from information relating to the multiplexed state
of the respective coded data and the logical channels
corresponding to the respective streams (stream asso-
ciation table). The reason is as follows. In order to real-
ize exchange of the contents of streams corresponding
to the respective objects and to facilitate interface
between the multiplexed bit stream and applications
treating this multiplexed bit stream without changing the
scene composition (i.e., the hierarchical structure of the
objects constituting a scene), the structure for multiplex-
ing, which depends on the physical layer of the multi-
plexed bit stream, must be separated from main
information (coded data) included in the multiplexed bit
stream.

However, the multiplexed bit stream format accord-
ing to the prior art has the following drawbacks.

A great advantage of object coding resides in that it
enables extraction of coded data of a specific object
from the multiplexed bit stream, and retrieval of a spe-
cific object on the database containing the multiplexed
bit stream.

However, in order to recognize coded data of indi- 
vidual objects from the multiplexed bit stream MEg of 
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the above-mentioned structure, a complicated proce- 
dure is required as follows. For example, to recognize 
coded data of lower-layer objects from plural objects 
having a hierarchical structure, initially, the scene 
description according to the composition stream 
included in the multiplexed bit stream MEg is interpreted 
to find an object corresponding to a node, and a stream 
corresponding to a lower object being a component of
the object (node) is specified. Then, the stream associ- 
ation table AT is interpreted and, according to the 
stream id of the specified stream, a logical channel LC 
corresponding to the stream id is found. Thereby, coded 
data of the specified object can be extracted from the 
multiplexed bit stream MEg. 

Furthermore, since the hierarchical relationship of
the streams corresponding to the respective objects can
be seen from the stream association table AT, it is pos-
sible to infer the coded data of a specific object
from the stream association table AT alone, but
this inference takes time and is not reliable.

That is, on the stream association table AT, informa-
tion relating to objects as nodes is not clearly defined. In
addition, since this table AT does not show the type of
stream corresponding to coded data (for example,
whether a stream corresponds to video data or audio
data), other information such as the composition stream
should be referred to. Further, for each stream, only its
upper stream is known from the table AT. So, it is impos-
sible to determine uniquely which streams make up the
coded data of each object, and interpretation takes
time.

For example, in the scene description SD according
to the composition stream shown in figure 20, although
Node(2) 146 corresponding to the object 126 exists, a
stream corresponding to Node(2) does not exist. So, on
the stream association table AT shown in figure 21, an
entry corresponding to Node(2) (i.e., stream id, LC cor-
responding to the stream, and LC corresponding to the
stream's upper-layer stream) does not exist.

Accordingly, in order to extract the object 126 corre-
sponding to Node(2), initially, stream indices (id) corre-
sponding to the lower-layer objects 128 and 129 of
Node(2) 146 must be decided on the basis of the scene
description SD according to the composition stream
(refer to figure 20) and, thereafter, the logical channels
(LC) of packets containing the streams having the
decided stream indices must be defined on the stream
association table AT (refer to figure 21).
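
The two-step lookup just described can be sketched in Python as follows. The child stream indices of Node(2) come from the scene description in the text. The contents of the stream association table are not fully given in this section, so the logical channel values below are assumed for illustration: only the entries for Sid=1 to 3 follow the channels stated for figure 19, and the remaining entries are invented placeholders.

```python
# Step 1 source: scene description (figure 20), node name -> child stream ids.
NODE_CHILDREN = {"Node(2)": [6, 7]}

# Step 2 source: stream association table, stream id -> logical channel.
# Sid 1-3 follow the figure 19 description; Sid 4-8 are assumed values.
ASSOCIATION_TABLE = {1: 3, 2: 4, 3: 5, 4: 6, 5: 7, 6: 8, 7: 9, 8: 10}


def channels_for_node(node):
    """Resolve a node to the logical channels of its lower-layer streams."""
    stream_ids = NODE_CHILDREN[node]                     # interpret the scene description
    return [ASSOCIATION_TABLE[s] for s in stream_ids]    # then consult the table


print(channels_for_node("Node(2)"))  # [8, 9] with the assumed table values
```

Neither data structure alone answers the query, since Node(2) has no entry of its own in the association table; this is the indirection the passage identifies as a drawback.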

Further, there is a case where coded data corre- 
sponding to plural objects are transmitted without being 
multiplexed in a particular transmission medium, such 
as computer network (internet). In this case, the bit 
stream has a data structure including no logical chan- 
nels, and does not include the stream association table. 

In this case, detection of a specific object from the 
bit stream is carried out by interpreting the hierarchical 
structure of the objects on the basis of the scene 
description SD according to the composition stream. 



However, when the number of the objects increases
considerably, it requires a lot of time to interpret the hier-
archical structure of the objects on the basis of the com-
position stream, resulting in poor controllability in
replacement or editing of objects in a scene.

SUMMARY OF THE INVENTION 

It is an object of the present invention to provide an
object data processing apparatus that speedily per-
forms extraction, selection, or retrieval of coded data
corresponding to a specific object from coded data cor-
responding to plural objects, whereby the user can edit
or replace object data in short time with high controlla-
bility.

It is another object of the present invention to pro-
vide an object data recording apparatus that records
coded data corresponding to plural objects so that
coded data of a specific object among the plural objects
can be extracted, selected, or retrieved easily and
speedily.

It is still another object of the present invention to
provide a data storage medium containing coded data
having a data structure, which data structure realizes
simple and speedy extraction, selection, or retrieval of
coded data of a specific object from coded data corre-
sponding to plural objects.

It is a further object of the present invention to pro-
vide a data structure for transmission that realizes sim-
ple and speedy extraction, selection, or retrieval of
coded data of a specific object from coded data corre-
sponding to plural objects.

Other objects and advantages of the invention will
become apparent from the detailed description that fol-
lows. The detailed description and specific embodi-
ments described are provided only for illustration since
various additions and modifications within the scope of
the invention will be apparent to those of skill in the art
from the detailed description.

According to a first aspect of the present invention,
there is provided an object data processing apparatus
for decoding N pieces of coded data (N = positive inte-
ger) obtained by compressively coding N pieces of
object data which constitute individual data to be
recorded or transmitted and have a hierarchical struc-
ture, for each object data. This apparatus includes hier-
archical information extraction means for extracting
hierarchical information showing the hierarchical rela-
tionship of the N pieces of object data, according to the
coded data; and table creation means for creating,
according to the hierarchical information, an object table
on which the respective object data are correlated with
coded data corresponding to the respective object data.
Therefore, extraction, selection or retrieval of coded
data of a specific object can be carried out easily and
speedily, and this enables the user to edit or replace the
object data in short time with high controllability.
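
One way to picture the table creation described in this aspect is to flatten the hierarchy once, so that every object, including composite nodes, maps directly to the streams that make it up. The Python sketch below does this for the scene of figure 20. It is only an illustration of the idea under assumed data layouts: the tuple format, the function name, and the shape of the resulting table are hypothetical and are not the table format defined by the invention.

```python
# Hierarchical information as (name, stream id or None, children).
# Names and stream ids follow the figure 20 description.
SCENE = ("Scene", None, [
    ("Video(1)", 1, []),
    ("Node(1)", 2, [
        ("Video(2)", 4, []),
        ("Video(3)", 5, []),
        ("Node(2)", None, [("Video(4)", 6, []), ("Video(5)", 7, [])]),
        ("Audio(2)", 8, []),
    ]),
    ("Audio(1)", 3, []),
])


def build_object_table(node, table=None):
    """Map every object name to all stream ids belonging to it or its parts."""
    if table is None:
        table = {}
    name, sid, children = node
    sids = [] if sid is None else [sid]
    for child in children:
        build_object_table(child, table)
        sids.extend(table[child[0]])  # inherit the streams of each component
    table[name] = sids
    return table


object_table = build_object_table(SCENE)
print(object_table["Node(2)"])  # [6, 7]
print(object_table["Node(1)"])  # [2, 4, 5, 6, 7, 8]
```

With such a table built once at decode time, finding the coded data of a composite object becomes a single lookup instead of a fresh interpretation of the scene description.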

According to a second aspect of the present inven- 
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tion, there is provided an object data processing appa- 
ratus for decoding N pieces of coded data (N = positive 
integer) obtained by compressively coding scene data 
corresponding to one scene, for each of N pieces of
objects constituting the scene. This apparatus includes
hierarchical information extraction means for extracting
hierarchical information showing the hierarchical rela-
tionship of the respective objects constituting the scene,
according to the coded data; and table creation means
for creating, according to the hierarchical information,
an object table on which the respective objects are cor-
related with coded data corresponding to the respective
objects. Therefore, on the decoder side, extraction,
selection or retrieval of a specific object from plural
objects (video and audio) constituting one scene can be
carried out easily and speedily with reference to the
object table.

According to a third aspect of the present invention,
in the object data processing apparatus according to the
second aspect, the hierarchical information extraction
means is constructed so that it extracts priority informa-
tion showing the priority order of the respective objects,
according to the coded data, in addition to the hierarchi-
cal information; and the table creation means is con-
structed so that it creates, according to the hierarchical
information and the priority information, an object table
on which the respective objects are correlated with
coded data corresponding to the respective objects and
the priority order of the respective objects is shown.
Therefore, when the throughput of the decoding appara-
tus is low and the apparatus cannot decode coded data
of all objects, only objects having priorities higher than a
prescribed priority are subjected to decoding.
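
The priority-based selection in this aspect can be illustrated with a minimal Python sketch. The priority values, the convention that a smaller number means a higher priority, and all names below are assumptions for this example; the text specifies only that objects with priorities higher than a prescribed priority are decoded.

```python
# Hypothetical object table rows: object name, stream id, assumed priority.
# Convention assumed here: a smaller number means a higher priority.
OBJECT_TABLE = [
    {"object": "Video(1)", "sid": 1, "priority": 1},
    {"object": "Audio(1)", "sid": 3, "priority": 2},
    {"object": "Video(2)", "sid": 4, "priority": 4},
    {"object": "Video(3)", "sid": 5, "priority": 4},
    {"object": "Audio(2)", "sid": 8, "priority": 3},
]


def streams_to_decode(table, prescribed_priority):
    """Return stream ids whose priority is strictly higher than the prescribed one."""
    return [row["sid"] for row in table if row["priority"] < prescribed_priority]


selected = streams_to_decode(OBJECT_TABLE, prescribed_priority=3)
print(selected)  # [1, 3]
```

A decoder with low throughput could raise or lower the prescribed priority until the selected streams fit its capacity.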

According to a fourth aspect of the present inven-
tion, the object data processing apparatus according to
the second aspect further includes identification infor-
mation detection means for detecting identification infor-
mation for identifying coded data of a specific object
designated, with reference to the object table; and
decoding means for extracting coded data of the spe-
cific object from the N pieces of coded data according to
the identification information, and decoding the
extracted coded data. Therefore, in the decoding appa-
ratus, retrieval of an object specified by the user is facil-
itated.

According to a fifth aspect of the present invention,
there is provided an object data processing apparatus
for decoding multiplexed data including N pieces of
coded data (N = positive integer) obtained by compres-
sively coding scene data corresponding to one scene,
for each of N pieces of objects constituting the scene.
This apparatus includes hierarchical information extrac-
tion means for extracting hierarchical information show-
ing the hierarchical relationship of the N pieces of
objects constituting the scene, according to information
showing the correlation of the respective coded data
and included in the multiplexed data; and table creation
means for creating, according to the hierarchical infor-
mation, an object table on which the respective objects
are correlated with coded data corresponding to the
respective objects. Therefore, extraction, selection or
retrieval of a specific object from plural objects (video
and audio) constituting one scene can be carried out
easily and speedily, on the basis of the multiplexed data,
with reference to the object table.

According to a sixth aspect of the present invention,
in the object data processing apparatus according to the
fifth aspect, the hierarchical information extraction
means is constructed so that it extracts priority information
showing the priority order of the respective objects,
according to the multiplexed data, in addition to the
hierarchical information; and the table creation means is
constructed so that it creates, according to the hierarchical
information and the priority information, an object
table on which the respective objects are correlated with
coded data corresponding to the respective objects and
the priority order of the respective objects is shown.
Therefore, when the throughput of the decoding apparatus
is low and the apparatus cannot decode coded data
of all objects, only objects having priorities higher than a
prescribed priority are subjected to decoding.
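
The priority-based selection described in this aspect can be sketched as follows. This is only an illustrative Python sketch: the class name, the field names, the sample entries, and the convention that a smaller number means a higher priority are all assumptions made for the example and are not taken from the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectEntry:
    """One row of a hypothetical object table."""
    object_id: int  # index identifying the object within the scene
    stream_id: int  # index of the coded data (stream) for this object
    priority: int   # assumed convention: smaller value = higher priority


def select_for_decoding(table, prescribed_priority):
    """Keep only the entries whose priority is higher than the prescribed one."""
    return [entry for entry in table if entry.priority < prescribed_priority]


# Illustrative table; the values are placeholders, not data from the patent.
table = [
    ObjectEntry(object_id=1, stream_id=1, priority=1),
    ObjectEntry(object_id=2, stream_id=2, priority=2),
    ObjectEntry(object_id=3, stream_id=3, priority=3),
]

# A decoder with low throughput decodes only objects above priority 3.
selected = select_for_decoding(table, prescribed_priority=3)
```

Because the priority order is held in the table itself, a decoder could make this decision before extracting any coded data from the multiplexed data.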

According to a seventh aspect of the present invention,
the object data processing apparatus according to
the fifth aspect further includes identification information
detection means for detecting identification information
for identifying coded data of a specific object designated,
with reference to the object table; and decoding
means for extracting coded data of the specific object
from the multiplexed data according to the identification
information, and decoding the extracted coded data.
Therefore, in the decoding apparatus, retrieval of an
object specified by the user is facilitated.

According to an eighth aspect of the present invention,
there is provided an object data processing apparatus
for selecting coded data of a specific object data
from N pieces of coded data (N = positive integer)
obtained by compressively coding N pieces of object
data which constitute individual data to be recorded or
transmitted and have a hierarchical structure, for each
object data. This apparatus includes hierarchical information
extraction means for extracting hierarchical
information showing the hierarchical relationship of the
N pieces of object data, according to the coded data;
and table creation means for creating, according to the
hierarchical information, an object table on which the
respective object data are correlated with coded data
corresponding to the respective object data. This apparatus
selects coded data of a specific object data from
the N pieces of coded data with reference to the object
table and outputs the selected coded data. Therefore,
selection of coded data of a specific object can be carried
out easily and speedily, and this enables the user to
edit, replace, or retrieve the object data in short time
with high controllability.

According to a ninth aspect of the present invention,
there is provided an object data processing apparatus
for selecting coded data of a specific object from N
pieces of coded data (N = positive integer) obtained by 
compressively coding scene data corresponding to one 
scene, for each of N pieces of objects constituting the 
scene. This apparatus includes hierarchical information 
extraction means for extracting hierarchical information 
showing the hierarchical relationship of the respective 
objects constituting the scene, according to the coded 
data; and table creation means for creating, according 
to the hierarchical information, an object table on which 
the respective objects are correlated with coded data
corresponding to the respective objects. This apparatus
selects coded data of a specific object from the N pieces
of coded data with reference to the object table and
outputs the selected coded data. Therefore, on the
decoder side, selection of a specific object from plural
objects (video and audio) constituting one scene can be
carried out easily and speedily.

According to a tenth aspect of the present inven- 
tion, there is provided an object data processing appa- 
ratus for selecting coded data of a specific object from 
multiplexed data including N pieces of coded data (N = 
positive integer) obtained by compressively coding 
scene data corresponding to one scene, for each of N 
pieces of objects constituting the scene. This apparatus 
includes hierarchical information extraction means for
extracting hierarchical information showing the hierarchical
relationship of the respective objects constituting
the scene, according to information showing the correlation
of the respective coded data and included in the
multiplexed data; and table creation means for creating, 
according to the hierarchical information, an object table 
on which the respective objects are correlated with 
coded data corresponding to the respective objects. 
This apparatus selects coded data of a specific object 
from the multiplexed data with reference to the object 
table and outputs the selected coded data. Therefore,
on the basis of the multiplexed data, selection of a spe- 
cific object from plural objects (video and audio) consti- 
tuting one scene can be carried out easily and speedily. 

According to an eleventh aspect of the present
invention, there is provided an object data recording
apparatus having a data storage for storing data, and
recording N pieces of coded data (N = positive integer)
in the data storage, which coded data are obtained by
compressively coding N pieces of object data which
constitute individual data to be recorded or transmitted
and have a hierarchical structure, for each object data.
This apparatus includes hierarchical information extraction
means for extracting hierarchical information showing
the hierarchical relationship of the respective object
data, according to the coded data; and table creation
means for creating, according to the hierarchical information,
an object table on which the respective object
data are correlated with coded data corresponding to
the respective object data. This apparatus records the N
pieces of coded data and the object table corresponding
to these coded data in the data storage. Therefore,
extraction, selection or retrieval of coded data of a specific
object can be carried out easily and speedily with
reference to the object table.

According to a twelfth aspect of the present invention,
there is provided an object data recording apparatus
having a data storage for storing data, and recording
N pieces of coded data (N = positive integer) in the data
storage, which coded data are obtained by compressively
coding scene data corresponding to one scene,
for each of N pieces of objects constituting the scene.
This apparatus includes hierarchical information extraction
means for extracting hierarchical information showing
the hierarchical relationship of the respective
objects constituting the scene, according to the coded
data; and table creation means for creating, according
to the hierarchical information, an object table on which
the respective objects are correlated with coded data
corresponding to the respective objects. This apparatus
records the N pieces of coded data and the object table
corresponding to these coded data in the data storage.
Therefore, on the decoder side, extraction, selection or
retrieval of a specific object from plural objects (video
and audio) constituting one scene can be performed
easily and speedily with reference to the object table.

According to a thirteenth aspect of the present
invention, in the object data recording apparatus
according to the twelfth aspect, the object table corresponding
to the N pieces of coded data is added to the
N pieces of coded data when being recorded. Therefore,
management of the recorded object table is facilitated.

According to a fourteenth aspect of the present
invention, in the object data recording apparatus
according to the twelfth aspect, the object table corresponding
to the N pieces of coded data is separated
from the N pieces of coded data when being recorded.
Therefore, regardless of the size of the coded data,
recording of the object table to a storage medium is carried
out with high reliability.

According to a fifteenth aspect of the present invention,
there is provided an object data recording apparatus
having a data storage for storing data, and recording
multiplexed data including N pieces of coded data (N =
positive integer) in the data storage, which coded data
are obtained by compressively coding scene data corresponding
to one scene, for each of N pieces of objects
constituting the scene. This apparatus includes hierarchical
information extraction means for extracting hierarchical
information showing the hierarchical
relationship of the respective objects constituting the
scene, according to information showing the correlation
of the respective coded data and included in the multiplexed
data; and table creation means for creating,
according to the hierarchical information, an object table
on which the respective objects are correlated with
coded data corresponding to the respective objects.
This apparatus records the multiplexed data and the
object table corresponding to the multiplexed data in the
data storage. Therefore, on the decoder side, extraction,
selection or retrieval of a specific object from plural
objects (video and audio) constituting one scene can be
performed easily and speedily with reference to the
object table.

According to a sixteenth aspect of the present
invention, in the object data recording apparatus
according to the fifteenth aspect, the object table corresponding
to the multiplexed data is added to the multiplexed
data when being recorded. Therefore,
management of the recorded object table is facilitated.

According to a seventeenth aspect of the present 
invention, in the object data recording apparatus 
according to the fifteenth aspect, the object table corresponding
to the multiplexed data is separated from the
multiplexed data when being recorded. Therefore,
regardless of the size of the multiplexed data, recording
of the object table in a storage medium is carried out
with high reliability.

According to an eighteenth aspect of the present
invention, there is provided a data storage medium containing
relevant data relating to individual data to be
recorded or transmitted, wherein the relevant data
includes an object table on which N pieces of object
data (N = positive integer) constituting the individual
data and having a hierarchical structure are correlated
with N pieces of coded data obtained by compressively
coding the respective object data. Therefore, extraction,
selection or retrieval of specific object data from the
individual data can be carried out easily and speedily
with reference to the object table.

According to a nineteenth aspect of the present
invention, there is provided a data storage medium containing
relevant data corresponding to one scene,
wherein the relevant data includes an object table on
which N pieces of coded data (N = positive integer)
obtained by compressively coding scene data corresponding
to one scene for each of N pieces of objects
constituting the scene are correlated with the respective
objects. Therefore, on the decoder side, extraction,
selection or retrieval of a specific object from plural
objects (video and audio) constituting one scene can be
performed easily and speedily with reference to the
object table.

According to a twentieth aspect of the present
invention, there is provided an object data processing
apparatus for outputting N pieces of coded data (N =
positive integer) obtained by compressively coding N
pieces of object data which constitute individual data to
be recorded or transmitted and have a hierarchical
structure, for each object data. This apparatus includes
hierarchical information extraction means for extracting
hierarchical information showing the hierarchical relationship
of the N pieces of object data, according to the
coded data; and table creation means for creating,
according to the hierarchical information, an object table
on which the respective object data are correlated with
coded data corresponding to the respective object data.
This apparatus outputs the N pieces of coded data to
which the object table corresponding to these coded
data is added. Therefore, it is not necessary to create
an object table on the decoder side, whereby edition,
replacement or retrieval of object data can be performed
by a simple structure, in short time with high
controllability.

According to a twenty-first aspect of the present 
invention, there is provided an object data processing 
apparatus for outputting N pieces of coded data (N = 
positive integer) obtained by compressively coding 
scene data corresponding to one scene, for each of N
pieces of objects constituting the scene. This apparatus 
includes hierarchical information extraction means for 
extracting hierarchical information showing the hierar- 
chical relationship of the respective objects constituting 
the scene, according to the coded data; and table crea- 
tion means for creating, according to the hierarchical 
information, an object table on which the respective 
objects are correlated with coded data corresponding to
the respective objects. This apparatus outputs the N
pieces of coded data to which the object table
corresponding to these coded data is added. Therefore, it is
not necessary to create an object table on the decoder 
side, whereby edition, replacement or retrieval of 
objects (video and audio) constituting one scene can be 
performed by a simple structure, in short time with high 
controllability. 

According to a twenty-second aspect of the present 
invention, there is provided an object data processing 
apparatus for decoding data output from the object data
processing apparatus according to the twenty-first 
aspect. This apparatus includes data separation means 
for separating the object table from the output data; and 
table storage means for storing the separated object 
table. In this apparatus, decoding of the coded data
corresponding to the respective objects is controlled using
the information shown in the object table stored in the 
table storage means. Therefore, it is possible to realize 
a decoding apparatus of simple structure that can per- 
form edition, replacement or retrieval of objects (video 
and audio) constituting one scene in short time with
high controllability. 
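
One way to picture the data separation means and the table storage means is the Python sketch below. The container layout used here (a 4-byte big-endian length, a JSON-encoded object table, then the coded data) is purely an assumption chosen for illustration; the specification does not define how the object table is attached to the output data, and the function names are hypothetical.

```python
import json
import struct


def attach_table(object_table, coded_data: bytes) -> bytes:
    """Output side: prepend the object table to the coded data (assumed layout)."""
    header = json.dumps(object_table).encode("utf-8")
    # 4-byte big-endian header length, then the header, then the coded data.
    return struct.pack(">I", len(header)) + header + coded_data


def separate_table(output_data: bytes):
    """Decoder side: split the received data into (object table, coded data)."""
    (header_len,) = struct.unpack(">I", output_data[:4])
    header = output_data[4:4 + header_len]
    coded_data = output_data[4 + header_len:]
    return json.loads(header.decode("utf-8")), coded_data


# Illustrative values only.
object_table = [
    {"object_id": 1, "stream_id": 1},
    {"object_id": 2, "stream_id": 2},
]
output_data = attach_table(object_table, b"\x00\x01coded")

# The separated table would be kept in table storage and consulted to
# control decoding; no table has to be rebuilt on the decoder side.
stored_table, coded = separate_table(output_data)
```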

According to a twenty-third aspect of the present
invention, there is provided an object data processing
apparatus for outputting multiplexed data including N
pieces of coded data (N = positive integer) obtained by
compressively coding scene data corresponding to one
scene, for each of N pieces of objects constituting the
scene. This apparatus includes hierarchical information
extraction means for extracting hierarchical information
showing the hierarchical relationship of the respective
objects constituting the scene, according to information
showing the correlation of the respective coded data
included in the multiplexed data; and table creation
means for creating, according to the hierarchical information,
an object table on which the respective objects
are correlated with coded data corresponding to the
respective objects. This apparatus outputs the multiplexed
data to which the object table corresponding to
the multiplexed data is added. Therefore, it is not necessary
to create an object table on the decoder side,
whereby edition, replacement or retrieval of objects
(video and audio) constituting one scene can be performed
by a simple structure, in short time with high
controllability.

According to a twenty-fourth aspect of the present 
invention, there is provided an object data processing 
apparatus for decoding data output from the object data 
processing apparatus according to the twenty-third 
aspect. This apparatus includes data separation means 
for separating the object table from the output data; and 
table storage means for storing the separated object 
table. In this apparatus, decoding of the coded data cor- 
responding to the respective objects is controlled using 
the information shown in the object table stored in the 
table storage means. Therefore, it is possible to realize 
a decoding apparatus of simple structure that can per- 
form edition, replacement or retrieval of objects (video 
and audio) constituting one scene in short time with
high controllability.

According to a twenty-fifth aspect of the present 
invention, there is provided a data structure for transmit- 
ting N pieces of coded data (N = positive integer) 
obtained by compressively coding N pieces of object 
data which constitute individual data to be recorded or 
transmitted and have a hierarchical structure, for each 
object data. In this data structure, a data group comprising
the N pieces of coded data includes an object table
on which the respective object data are correlated with
coded data corresponding to the respective object data.
Therefore, extraction, selection or retrieval of coded
data corresponding to a specific object can be carried
out easily and speedily with reference to the object
table.

According to a twenty-sixth aspect of the present
invention, there is provided a data structure for transmit- 
ting N pieces of coded data (N = positive integer) 
obtained by compressively coding scene data corre- 
sponding to one scene, for each of N pieces of objects 
constituting the scene. In this data structure, a data 
group comprising the N pieces of coded data includes 
an object table on which the respective objects are cor- 
related with coded data corresponding to the respective 
objects. Therefore, on the decoder side, extraction, 
selection or retrieval of a specific object from plural
objects (video and audio) constituting one scene can be
carried out easily and speedily with reference to the
object table. 

According to a twenty-seventh aspect of the
present invention, there is provided an object data
processing apparatus for processing multiplexed data
including N pieces of coded data (N = positive integer)
and being partitioned into plural packets each having a
prescribed code quantity, which coded data are
obtained by compressively coding N pieces of object
data which constitute individual data to be recorded or
transmitted and have a hierarchical structure, for each
object data. This apparatus includes hierarchical information
extraction means for extracting hierarchical
information showing the hierarchical relationship of the
N pieces of object data, according to information showing
the correlation of the respective coded data and
included in the multiplexed data; and table creation
means for creating, according to the hierarchical information,
an object table showing the hierarchical relationship
of the plural packets constituting the
multiplexed data. Therefore, extraction, selection or
retrieval of coded data corresponding to a specific
object on the basis of the multiplexed data can be carried
out easily and speedily with reference to the object
table, and this enables the user to edit or replace the
object data in short time with high controllability.

According to a twenty-eighth aspect of the present
invention, there is provided an object data processing
apparatus for processing multiplexed data including N
pieces of coded data (N = positive integer) and being
partitioned into plural packets each having a prescribed
code quantity, which coded data are obtained by compressively
coding scene data corresponding to one
scene, for each of N pieces of objects constituting the
scene. This apparatus includes hierarchical information
extraction means for extracting hierarchical information
showing the hierarchical relationship of the respective
objects constituting the scene, according to information
showing correlation of the respective coded data
included in the multiplexed data; and table creation
means for creating, according to the hierarchical information,
an object table showing the hierarchical relationship
of the plural packets constituting the
multiplexed data. Therefore, extraction, selection or
retrieval of a specific object from plural objects (video
and audio) constituting one scene on the basis of the
multiplexed data can be carried out easily and speedily
with reference to the object table.

According to a twenty-ninth aspect of the present
invention, there is provided an object data recording
apparatus having a data storage for storing data, and
recording, in the storage, multiplexed data which
includes N pieces of coded data (N = positive integer)
and is partitioned into plural packets each packet having
a prescribed coded quantity, which coded data are
obtained by compressively coding N pieces of object
data which constitute individual data to be recorded or
transmitted and have a hierarchical structure, for each
object data. This apparatus includes hierarchical information
extraction means for extracting hierarchical
information showing the hierarchical relationship of the
N pieces of object data, according to information showing
the correlation of the respective coded data and
included in the multiplexed data; and table creation
means for creating, according to the hierarchical information,
an object table showing the hierarchical relationship
of the plural packets constituting the
multiplexed data. This apparatus records the multiplexed
data and the object table corresponding to the
multiplexed data in the data storage. Therefore, extraction,
selection or retrieval of coded data corresponding
to a specific object can be carried out easily and speedily
with reference to the object table.

According to a thirtieth aspect of the present invention,
there is provided an object data recording apparatus
having a data storage for storing data, and
recording, in the data storage, multiplexed data which
includes N pieces of coded data (N = positive integer)
and is partitioned into plural packets each having a prescribed
coded quantity, which coded data are obtained
by compressively coding scene data constituting one
scene, for each of N pieces of objects constituting the
scene. This apparatus includes hierarchical information
extraction means for extracting hierarchical information
showing the hierarchical relationship of the respective
objects constituting the scene, according to information
showing the correlation of the respective coded data
and included in the multiplexed data; and table creation
means for creating, according to the hierarchical information,
an object table showing the hierarchical relationship
of the plural packets constituting the
multiplexed data. This apparatus records the multiplexed
data and the object table corresponding to the
multiplexed data in the data storage. Therefore, on the
decoder side, extraction, selection or retrieval of a specific
object from plural objects (video and audio) constituting
one scene can be carried out easily and speedily
with reference to the object table.

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 is a block diagram for explaining an object
data decoding apparatus as an object data
processing apparatus according to a first embodiment
of the present invention.
Figure 2 is a diagram showing an object table created
by the object data decoding apparatus according
to the first embodiment.
Figure 3 is a flowchart for explaining process steps
by a CPU in the object data decoding apparatus
according to the first embodiment.
Figure 4 is a diagram for explaining another object
table created in the object data decoding apparatus
according to the first embodiment, illustrating a
table corresponding to upper-layer objects and a
table corresponding to lower-layer objects.
Figure 5 is a block diagram for explaining an object
data selecting apparatus as an object data processing
apparatus according to a second embodiment
of the present invention.
Figure 6 is a flowchart for explaining process steps
by a CPU in the object data selecting apparatus
according to the second embodiment.
Figure 7 is a block diagram for explaining an object
data recording apparatus according to a third
embodiment of the present invention.
Figure 8 is a flowchart for explaining process steps
by a CPU in the object data recording apparatus
according to the third embodiment.
Figure 9 is a block diagram for explaining an object
data outputting apparatus as an object data
processing apparatus according to a fourth embodiment
of the present invention.
Figure 10 is a flowchart for explaining process
steps by a CPU in the object data outputting apparatus
according to the fourth embodiment.
Figure 11 is a diagram for explaining an object data
decoding apparatus based on MPEG4 as an object
data processing apparatus according to a fifth
embodiment of the invention, illustrating the outline
of a data transmission system based on MPEG4.
Figure 12 is a block diagram for explaining an object
data decoding apparatus as an object data
processing apparatus according to the fifth embodiment
of the invention.
Figures 13(a) and 13(b) are schematic diagrams for
explaining an object coding method corresponding
to the data transmission system shown in figure 11.
Figure 14(a) is a diagram showing a scene description
and figure 14(b) is a diagram showing object
descriptors, respectively used in the data transmission
system.
Figures 15(a) and 15(b) are diagrams showing an
object table obtained from the scene description
shown in figure 14(a) and the object descriptors
shown in figure 14(b).
Figure 16 is a diagram showing a flow of process
steps by a CPU in the object data decoding apparatus
according to the fifth embodiment.
Figures 17(a)-17(c) are diagrams for explaining a
data storage medium according to the present
invention, wherein figure 17(a) shows the structure
of a floppy disk, figure 17(b) shows the structure of
a floppy disk body, and figure 17(c) shows a computer
system using the floppy disk as a storage
medium.
Figure 18 is a schematic diagram for explaining an
object coding method according to the prior art.
Figure 19 is a diagram showing a data structure of
a bit stream obtained by multiplexing data coded by
the prior art object coding method and auxiliary
data.
Figure 20 is a diagram showing a scene description
according to a composition stream included in the
bit stream shown in figure 17 as auxiliary data.
Figure 21 is a diagram showing a stream association
table included in the bit stream shown in figure
17 as auxiliary data.
DETAILED DESCRIPTION OF THE PREFERRED
EMBODIMENTS

[Embodiment 1] 

Figure 1 is a block diagram illustrating an object
data decoding apparatus as an object data processing 
apparatus according to a first embodiment of the 
present invention. 

In figure 1, an object data decoding apparatus 101 
receives coded data corresponding to a single image 
(scene), performs decoding of the coded data, and out- 
puts regeneration data obtained by the decoding to the 
display unit 14. The coded data is identical to the multi- 
plexed bit stream MEg shown in figure 19 in which 
coded data obtained by object coding of scene data cor- 
responding to a single scene comprising plural objects 
are multiplexed with auxiliary data. The single scene 
corresponds to an image of each frame constituting a 
motion picture. The object data decoding apparatus 101 
successively decodes coded data of each frame input 
thereto, and successively outputs regeneration data 
corresponding to each frame.

More specifically, the object data decoding apparatus
101 includes a demultiplexer 11 and an AV
(audio/video) decoder 12. The demultiplexer 11 selects
and extracts a composition stream and a stream association
table which are auxiliary data Dsub included in the
multiplexed bit stream MEg, and outputs coded data Eg
corresponding to the respective objects in the multiplexed
bit stream MEg, in units of the respective
objects, according to a first control signal Cont1. The AV
decoder 12 decodes the coded data Eg corresponding
to each object according to a second control signal
Cont2, and outputs regeneration data Rg corresponding
to each scene. Further, the decoding apparatus 101
includes a CPU (central processing unit) 13. The CPU
13 decides a logical channel of a packet containing the
coded data Eg of each object according to the stream
association table, and supplies the first control signal
Cont1 to the demultiplexer 11 according to the result of
the decision. Further, the CPU 13 supplies information
relating to location of each object in one scene and
information relating to display start time of each object,
as the second control signal Cont2, to the AV decoder
12, on the basis of a scene description according to the
composition stream.

In this first embodiment, the CPU 13 creates an 
object table showing the correspondences between the
respective objects constituting one scene and coded 
data Eg of the respective objects in the multiplexed bit 
stream MEg, on the basis of the composition stream 
and the stream association table, and stores this table in 
a data storage inside the CPU 13. 
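
As a rough illustration of this table creation, the Python sketch below joins two inputs: a mapping from objects to streams, standing in for what the scene description of the composition stream provides, and a mapping from streams to logical channels, standing in for the stream association table. The dictionary layouts, the field names and the sample values are assumptions made for the example; they are not the actual formats of these auxiliary data.

```python
def create_object_table(scene_description, stream_association):
    """Correlate each object with the coded data (stream) that carries it.

    scene_description:  {object_id: stream_id}  (assumed layout)
    stream_association: {stream_id: (logical_channel, stream_type)}  (assumed layout)
    """
    table = {}
    # Rows are entered in the order of the object id values.
    for object_id in sorted(scene_description):
        stream_id = scene_description[object_id]
        logical_channel, stream_type = stream_association[stream_id]
        table[object_id] = {
            "logical_channel": logical_channel,
            "stream_type": stream_type,
            "stream_id": stream_id,
        }
    return table


# Placeholder inputs, not values read from figure 20 or figure 21.
scene_description = {2: 2, 1: 1}
stream_association = {1: (3, "video"), 2: (4, "audio")}

object_table = create_object_table(scene_description, stream_association)
```

Once such a table is stored, the logical channel needed to pick an object's packets out of the multiplexed bit stream can be found with a single lookup by object id.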

Figure 2 is a diagram for explaining an object table 
T1 corresponding to the scene 120 shown in figure 18. 

On this object table T1, various kinds of information
are entered, correlated with each object index (object
id) which is an index for identifying each object being a
constituent of the scene. Each object has its own value
Oid as its object index.

To be specific, object indices are uniquely given to
the respective object descriptors in the scene description
shown in figure 20, and entered to the object table
T1 in the order of the object id values Oid.

In figure 2, object indices Oid=1 to Oid=5 are given
to the descriptors 141 to 145 of the objects 121 to 125
shown in figure 20, respectively, and object indices
Oid=61 and Oid=62 are given to the descriptors 148
and 149 of the objects 128 and 129 shown in figure 20,
respectively. Further, an object index Oid=8 is given to
the descriptor 147 of the object 127 shown in figure 20.

On the object table T1, correlated with each object
index (id), the following components are entered: a logical
channel (LC) corresponding to the object; the
stream type (i.e., whether the stream is video or audio);
a stream index corresponding to the object; indices of
upper and lower objects corresponding to the object;
logical channels of the upper and lower objects; the
index of an object which shares its logical channel with
the object (common object id in figure 2); and the priority
order of the respective objects. In figure 2, OLC is a
specific value of logical channel LC.

An upper object of each object is an object which
belongs to an upper layer in the hierarchical structure
than a layer to which each object belongs. A lower
object of each object is an object which belongs to a
lower layer than a layer to which each object belongs.

To be specific, the uppermost object, i.e., an upper
object of the objects belonging to the first layer L1 in
figure 18, is the object itself. So, the uppermost object
has its object id value Oid=1 and its logical channel
value OLC=3 for its upper object id value Oid and its
upper object logical channel value OLC. Further, the
upper objects of the respective objects belonging to the
second layer L2 and the third layer L3 in figure 18 are
the objects belonging to the first layer L1 and composed
of the objects of the second and third layers L2 and L3.
Further, objects having no lower objects (lowermost
objects) have lower object id values Oid=0. In this first
embodiment, a specific object's being the uppermost
object is shown by giving its object id as its upper object
id, and a specific object's being the lowermost object is
shown by giving Oid=0 as its lower object id. However, a
special digit or symbol may be used for describing that
a specific object is the uppermost or lowermost object.
From the object table T1 so constructed, it can be
seen that the object having Oid=2 as its object id is
composed of four objects because "Oid=4, 5, 6, 8" is
described as its lower object id.
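
The table components listed above can be pictured as one record per object. The following is only a minimal sketch in Python; the class name ObjectEntry, its field names, and most of the sample values are assumptions made for illustration and are not taken from figure 2. Only the conventions stated in the text are relied upon: the uppermost object carries its own id as its upper object id, and a lowermost object carries Oid=0 as its lower object id.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ObjectEntry:
    """One row of a hypothetical object table (field names are illustrative)."""
    oid: int                       # object id (Oid)
    lc: Optional[int]              # logical channel (OLC) of the object
    stream_type: str               # "video" or "audio"
    sid: Optional[int]             # stream id (Sid)
    upper_oid: int                 # equals oid for the uppermost object
    lower_oids: List[int] = field(default_factory=lambda: [0])  # [0] = lowermost
    common_oids: List[int] = field(default_factory=list)        # objects sharing lc
    priority: int = 0

    def is_uppermost(self) -> bool:
        return self.upper_oid == self.oid

    def is_lowermost(self) -> bool:
        return self.lower_oids == [0]


# Sample rows; logical channel and priority values are made up for the sketch.
table: Dict[int, ObjectEntry] = {
    1: ObjectEntry(oid=1, lc=3, stream_type="video", sid=1,
                   upper_oid=1, lower_oids=[2, 3]),
    2: ObjectEntry(oid=2, lc=6, stream_type="video", sid=2,
                   upper_oid=1, lower_oids=[4, 5, 6, 8]),
    4: ObjectEntry(oid=4, lc=6, stream_type="video", sid=4,
                   upper_oid=2, common_oids=[5]),
}
```

With such records, the statement that the object having Oid=2 is composed of four objects corresponds to reading the length of its lower object id list.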

Further, in figure 2, for the object 128 (129) which is
a component of the object 126 corresponding to a node
having no stream (in the description of figure 20,
Node(2) 146), its object id is given as follows. That is, a
digit showing that this is a node common to some
objects is given as the upper column of its Oid, and a
digit showing its actual object id value is given as the
lower column of the Oid.
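
As a small illustration of this numbering, the sketch below packs the node digit and the object's own digit into one decimal value, which reproduces Oid=61 and Oid=62 for the components of the node having Oid=6. The function names and the restriction to single decimal digits are assumptions made for the example only.

```python
def compose_child_oid(node_oid: int, child_index: int) -> int:
    """Upper digit: id of the stream-less node; lower digit: the object's own id."""
    if not 0 <= child_index <= 9:
        raise ValueError("child_index must be a single decimal digit")
    return node_oid * 10 + child_index


def split_child_oid(oid: int) -> tuple:
    """Return (node digit, own digit) of a composed object id."""
    return divmod(oid, 10)
```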

Accordingly, from the object table T1, it is found that
the object 126 having its object id value Oid=6 is merely
a node and there exists no coded data corresponding to
this node, and the stream of this object 126 is composed
of the streams of the objects 128 and 129 having
object ids Oid=61 and Oid=62, i.e., the streams of Sid=6
and Sid=7, respectively.

Further, the object 122 having Oid=2 as its object id
corresponds to a node, like the object 126 having Oid=6
as its object id. However, since this object 122 is
explicitly described on the object table T1, there exists a
stream corresponding to this object 122, i.e., the stream
of Sid=2.

Information included in a stream corresponding to a
node is information common to all objects belonging to
the node, for example, composition information peculiar
to the objects, such as common system clock, display
start timing, decoding start time, and display position,
and copyright information.

A description is now given of the operation of the 
object data decoding apparatus 101.

When the multiplexed bit stream MEg shown in figure 19
is input to the object data decoding apparatus
101, the composition stream and the stream association
table (auxiliary data Dsub) are extracted from the
multiplexed bit stream MEg and supplied to the CPU 13 by
the demultiplexer 11.

In the CPU 13, the correspondences between the
respective objects constituting the scene 120 shown in
figure 19 and their logical channels LC are recognized
and, according to the recognition, a first control signal
Cont1 is output to the demultiplexer 11.

In the demultiplexer 11, according to the first control
signal Cont1, streams corresponding to the respective
objects, which are allocated to plural packets in the
multiplexed bit stream, are output to the AV decoder 12 in
object units.

At this time, in the CPU 13, information relating to
each object's display position and display start time is
extracted from the scene description according to the
composition stream, and the extracted information is
output to the AV decoder 12 as a second control signal
Cont2.

In the AV decoder 12, a stream (a series of coded
data Eg) corresponding to each object output from the
demultiplexer 11 is subjected to decoding. Decoded
data corresponding to the respective objects are
composited according to the second control signal Cont2
from the CPU 13 (i.e., information relating to object
display), and regeneration data Rg corresponding to the
scene 120 composed of plural objects is output.

The decoding operation mentioned above is similar
to that of the conventional object data decoding
apparatus.

The object data decoding apparatus 101 according
to this first embodiment is characterized in that the
object table T1 shown in figure 2 is created by the CPU
13.

Hereinafter, a description is given of the object table
creation process. Figure 3 is a flowchart showing the
algorithm for creating the object table T1 by the CPU 13.

Initially, in step S1, the composition stream is read
into the data storage of the CPU 13. In step S2, the
stream association table is read into the data storage of
the CPU 13. In step S3, descriptors of the respective
objects on the composition table are loaded into a
processor of the CPU 13 wherein an object id value Oid is
given to each descriptor, whereby each object can be
identified by the object id.

In step S4, it is decided whether or not the object
whose descriptor has been loaded corresponds to a
node and has no stream. When it corresponds to a node
and has no stream, in step S11, the layer of the object
whose descriptor is to be loaded is lowered by one,
followed by step S3 wherein the descriptors of the lower
objects being components of the object corresponding
to the node are loaded into the CPU 13.

When it is decided in step S4 that the object
whose descriptor has been loaded is not one
corresponding to a node and having no stream, in step S6,
the object id value Oid is entered as a component of the
object table.

Thereafter, in step S7, the stream association table is
interpreted and, according to the result of the
interpretation, various kinds of table components corresponding
to each object are entered in the object table. The main
table components are as follows: the logical channels
(LC) corresponding to the respective objects, the
priority order of the respective objects, the stream indices
(id) corresponding to the respective objects, and the
stream type (i.e., video or audio). Besides, the following
table components are also entered: the indices of the
upper and lower objects corresponding to the respective
objects, the logical channels of the upper and lower
objects, and the indices of objects which share their
logical channels with other objects.

In step S8, it is decided whether entry of table
components relating to objects that belong to the same node
as an object currently being processed by the CPU 13
has been completed or not. When it has not been
completed, the control of the CPU 13 returns to step S3,
followed by steps S4 to S8. On the other hand, when it is
decided in step S8 that entry of table components has
been completed with respect to all the objects of the
node to which the object currently being processed
belongs, the CPU control proceeds to step S9 wherein
it is decided whether or not the object currently being
processed is the uppermost-layer object in the
hierarchy.

When the object currently being processed is not
the uppermost-layer object, a process of raising the
object layer by one is carried out in step S12, followed
by the decision in step S9. On the other hand, when it is
decided in step S9 that the object currently being
processed is the uppermost-layer object, the CPU control
proceeds to step S10 wherein it is decided whether
entry of table components has been completed or not
with respect to all the objects constituting one scene.

When it is decided that entry of table components
of all the objects has not been completed yet, the CPU
control returns to step S3, followed by steps S4 to S9,
S11 and S12. On the other hand, when the decision in
step S10 is that entry of table components of all the
objects has been completed, the object table creation
process by the CPU 13 is ended.
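
The flow of figure 3 can be sketched compactly as follows. This is not the flowchart itself: recursion is used here in place of the explicit layer lowering and raising of steps S11 and S12, and the nested-dictionary form of the scene description, the key names, and the function name build_object_table are all assumptions made for illustration.

```python
def build_object_table(descriptor, upper_oid=None, table=None):
    """Walk a scene tree and enter every object that has a stream.

    descriptor: dict with "oid", optional "sid" and "lc", optional "children".
    A node without "sid" is treated as a node having no stream: it gets no
    row of its own and the walk simply descends into its components.
    """
    if table is None:
        table = {}
    oid = descriptor["oid"]
    children = descriptor.get("children", [])
    if descriptor.get("sid") is not None:          # steps S6 and S7
        table[oid] = {
            "sid": descriptor["sid"],
            "lc": descriptor.get("lc"),
            # the uppermost object carries its own id as its upper object id
            "upper": oid if upper_oid is None else upper_oid,
            # Oid=0 marks an object having no lower objects
            "lower": [c["oid"] for c in children] or [0],
        }
    for child in children:                          # lower the layer by one
        build_object_table(child, oid, table)
    return table                                    # layer is raised on return
```

For a scene whose node with Oid=6 has no stream, the resulting table contains rows for the components Oid=61 and Oid=62 but no row for Oid=6 itself.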

In the object data decoding apparatus 101 according
to this first embodiment, the object table so created
is stored in the data storage of the CPU 13. The stored
object table will be updated at every updating of the
composition stream and the stream association table so
that it can correspond to the updated information.
Accordingly, the object table is updated only when any
of the objects constituting one scene is changed.

When the multiplexed bit stream includes a flag
showing that the updated composition stream and
stream association table are sent, the object table may
be updated only when the flag is newly sent.

When the object data decoding apparatus 101 performs
decoding of a specific object according to a
request from the user, a logical channel LC corresponding
to the specific object is specified on the basis of the
object table stored in the data storage of the CPU 13,
and only a packet having this logical channel LC is
extracted from the multiplexed bit stream for decoding.
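
A minimal sketch of this lookup is given below. The packet representation (a dictionary with an "lc" key and a "payload" key), the table layout, and the function name are assumptions made for the example; the point is only that a single table lookup yields the logical channel used to filter the packets.

```python
def packets_for_object(packets, object_table, oid):
    """Return the packets carrying the coded data of the object `oid`."""
    lc = object_table[oid]["lc"]      # one lookup instead of re-parsing Dsub
    return [p for p in packets if p["lc"] == lc]
```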

For example, when only the object 122 (Oid=2)
corresponding to a node is subjected to decoding, in the
multiplexed bit stream of the data structure according to
the prior art, it is necessary to interpret the composition
stream and the stream association table by the CPU
and specify the logical channel LC corresponding to the
object 122. In this first embodiment, however, since the
object table shown in figure 2 is included in the
multiplexed bit stream MEg, the logical channel LC
corresponding to the object 122 (i.e., OLC=6~9) can be
specified in a moment, resulting in high-speed retrieval.

Further, when the throughput of the decoding
apparatus is low and it cannot decode all the objects, it is
conceivable to decode only objects having high
priorities. In this first embodiment, since the object table
contains the priority order of the respective objects, it is
easy to specify the logical channels of high-priority
objects.
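
The priority-based selection might look like the following sketch. It assumes, purely for illustration, that a smaller priority number means a higher priority and that the number of objects the decoder can handle is known as max_objects; neither assumption is stated in the text.

```python
def channels_of_high_priority_objects(object_table, max_objects):
    """Return the logical channels of the `max_objects` highest-priority objects."""
    ranked = sorted(object_table.items(), key=lambda item: item[1]["priority"])
    return [entry["lc"] for _, entry in ranked[:max_objects]]
```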

Furthermore, since the object table contains the
indices of objects that share a logical channel with other
objects, the following effect is expected.

That is, in the bit stream according to this embodiment,
the objects 124 and 125 having Oid=4 and Oid=5
as their object ids (see figure 18) share coded data of
the same logical channel LC (OLC=6). So, although
these objects have different object id values and
different stream id values, these objects have the same value
(OLC=6) of corresponding logical channel LC.



When coded data of a specific object is deleted
from the multiplexed bit stream, according to the object
id of an object relating to the specific object, a logical
channel LC corresponding to the relevant object is
decided and, thereafter, coded data of the specific
object is extracted from the multiplexed bit stream.
However, when the specific object shares its logical channel
with the relevant object, if coded data of the specific
object is deleted, coded data corresponding to the
relevant object is gone, whereby decoding of the relevant
object cannot be carried out.

In this first embodiment, since the object table T1
contains the index of an object that shares its logical
channel with another object, this index can be used for
deciding whether coded data of the object can be deleted or
not, whereby the above-mentioned problem is avoided.
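
The deletion check can be sketched as below. The helper fill_common_ids, which derives the common object ids from equal logical channel values, and the rule that an object is deletable only when its common object id list is empty are assumptions chosen for this example.

```python
def fill_common_ids(object_table):
    """Record, for every object, the ids of other objects on the same channel."""
    for oid, entry in object_table.items():
        entry["common"] = [
            other for other, e in object_table.items()
            if other != oid and e["lc"] == entry["lc"]
        ]
    return object_table


def can_delete(object_table, oid):
    """Coded data may be deleted only if no other object shares its channel."""
    return not object_table[oid]["common"]
```

Applied to the objects having Oid=4 and Oid=5, which share the logical channel OLC=6, the check refuses deletion of either one, while an object with a channel of its own may be deleted.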

As described above, in this first embodiment of the
invention, on the basis of the multiplexed bit stream
including coded data Eg corresponding to plural
objects, the object table showing the correspondences
between the respective objects and the coded data is
created in advance and, using the object table,
extraction, selection, or retrieval of a specific object is carried
out. Therefore, as compared with the case where
retrieval or the like of a specific object is carried out by
interpreting information about the object included in the
multiplexed bit stream each time, the processing
quantity required for retrieval or the like is reduced, resulting
in high-speed processing.

Further, on the object table T1, since the correlation
of plural objects constituting one scene and having a
hierarchical structure is described clearly, even when
plural objects are included in one object, replacement
and editing of the objects are facilitated.

Although in this first embodiment the objects that
share a logical channel are shown by their object ids,
flags such as "1" and "0" may be used when it is only
required to show whether or not an object shares its
logical channel with another object. In this case, the
object table is simplified.

Further, the object table is not restricted to that
shown in figure 2. Hereinafter, a description is given of 
an object data decoding apparatus which creates an 
object table different from the object table shown in fig- 
ure 2, according to a modification of the first embodi- 
ment. 

Figure 4 is a diagram for explaining an object table
created by the decoding apparatus according to the
modification, wherein a table corresponding to
upper-layer objects and a table corresponding to lower-layer
objects are illustrated.

The object table T2 shown in figure 4 is different
from the object table T1 shown in figure 2 in that the
table T2 does not have logical channels (LC) of the
respective objects, and the table T2 is divided into two
parts, i.e., an upper-layer table T2a and a lower-layer
table T2b.

The process of creating the object table T2 is different
from the process flow shown in figure 3 only in that
the logical channels (LC) are not entered as table
components. So, the object table T2 can be created
according to a process flow similar to the process flow shown
in figure 3.

By the way, as mentioned above, the bit stream is
not always one in which coded data corresponding to
the respective objects are multiplexed. In a bit stream in
which coded data are not multiplexed, no stream
association table is included. So, no logical channel LC is
obtained from this bit stream.

In such a bit stream, the logical channels in the
multiplexed bit stream correspond to streams, so that coded
data of each object is specified using the stream id
instead of the logical channel LC.

Therefore, on the object table T2 shown in figure 4,
like the object table T1 shown in figure 2, an object
index (id) for identifying each object is entered to make
the relationship between the object and the stream
clear.

Further, on the object table T2 shown in figure 4, in
order to make the object hierarchy clear, "H" is adopted
as a code showing hierarchical information and
described in the section of the kind of stream. With
respect to the object having "H" in this section, the
object table T2b (lower-layer table) corresponding to
lower-layer objects included in this object is created.

As described above, since the object table has a
hierarchical structure, even when the hierarchy of
objects constituting one scene increases, the size of the
object table for each layer (T2a or T2b), which is a
component of the object table (T2), does not increase.
Therefore, when performing editing or replacement of
the upper-layer objects, only the upper-layer object
table (T2a), whose size is reduced, is retrieved,
whereby detection of objects is facilitated.
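
The two-part table can be sketched as an upper-layer table plus one lower-layer table per entry marked "H". The dictionary layout, the key names, the sample values, and the function name lookup are assumptions made for illustration; they are not the layout of figure 4.

```python
def lookup(oid, upper_table, lower_tables):
    """Find the entry of `oid`, opening a lower-layer table only when needed."""
    if oid in upper_table:                 # upper-layer objects: T2a only
        return upper_table[oid]
    for parent, entry in upper_table.items():
        # only entries whose kind of stream is "H" have a lower-layer table
        if entry["kind"] == "H" and oid in lower_tables.get(parent, {}):
            return lower_tables[parent][oid]
    return None


# Illustrative contents; stream ids and kinds are made-up sample values.
upper_table = {
    1: {"kind": "video", "sid": 1},
    2: {"kind": "H", "sid": 2},
}
lower_tables = {
    2: {4: {"kind": "video", "sid": 4}, 5: {"kind": "audio", "sid": 5}},
}
```

An edit that touches only upper-layer objects never has to read the lower-layer tables in this arrangement.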

Although in the first embodiment and its modification,
object tables are obtained from the multiplexed bit
stream shown in figure 3 and the non-multiplexed bit
stream shown in figure 4, respectively, object tables are
not restricted thereto.

For example, in order to make the object table
shown in figure 2 compact, from the table components
shown in figure 2, the upper object id, the upper object
LC, the priority order, the kind of stream, and the
common object id may be deleted.

On the contrary, although the size of the object table
is somewhat increased, in order to simplify operations
such as editing or replacement of objects, the logical
channel of the composition stream itself and the
logical channel of the stream association table itself
may be described on the object table, or header
information of streams corresponding to video and audio
objects may be added to the table.

As described above, the object table created by the
object data decoding apparatus according to the
present invention may have any structure as long as, on
the table, the respective objects are correlated with the
stream indices or logical channels of the objects.

Furthermore, in the object coding method, there is 
a case where the multiplexing relationship is expressed 
by only the stream association table in the bit stream 
without including the composition stream in the bit 
stream, in order to simplify the structure of the decoding 
apparatus. 

In this case, although the object-to-stream corre- 
spondence is not uniquely defined, an object table 
excluding some of table components of the object table 
T1 shown in figure 2, such as the object id, the stream 
id, the kind of stream, the priority order, and the com- 
mon object id, may be created from the stream associa- 
tion table. In this case, the hierarchical relationship of 
the logical channels corresponding to coded data of the 
respective objects is clearly defined. Further, an object 
table excluding some of table components of the object 
table T2 shown in figure 4, such as the object id, the 
stream id, the kind of stream, and the priority order, may 
be created from the stream association table. 

[Embodiment 2] 

Figure 5 is a block diagram for explaining an object
data selecting apparatus as an object data processing
apparatus according to a second embodiment of the
present invention.

In figure 5, an object data selecting apparatus 102 
according to this second embodiment selects and
extracts coded data Sg of a specific object from a multi- 
plexed bit stream MEg, according to an instruction sig- 
nal lu corresponding to user's instruction. The 
multiplexed bit stream MEg is identical to that shown in 
figure 19 wherein coded data of plural objects are multi- 
plexed in units of the objects. 

The object data selecting apparatus 102 includes a 
multiplexed stream interpreter 61, an object selector 62,
and a buffer 64. The multiplexed stream interpreter 61 
detects a composition stream and a stream association 
table which are auxiliary data Dsub included in the mul- 
tiplexed bit stream MEg. The object selector 62 selects 
and extracts coded data corresponding to a specific 
object from the multiplexed bit stream MEg according to 
a control signal Cont. The buffer 64 is disposed between 
the multiplexed stream interpreter 61 and the object 
selector 62, and retains the multiplexed bit stream for a 
prescribed period of time. 

Further, the selecting apparatus 102 includes a
CPU 63. The CPU 63 creates the object table T1 shown
in figure 2 on the basis of the composition stream and
the stream association table, and outputs a signal for
selecting coded data of a specified object (control signal
Cont) toward the object selector 62 according to an
object specifying signal lu generated as a result of
user's operation. Since coded data of the specific object
is extracted from the multiplexed bit stream MEg, the
contents described in the composition stream and the
stream association table change. So, the CPU 63
rewrites the auxiliary data relating to the composition
stream and the stream association table so that the
stream and the table correspond to the extracted object,
and outputs the data to the object selector 62. The
object selector 62 adds the composition stream and the
stream association table, which have been rewritten by
the CPU 63, to the coded data of the extracted object
when outputting the coded data.

A description is given of the operation of the object
data selecting apparatus 102 using a flowchart shown in
figure 6.

Initially, in step S71, the object table T1 is created
according to the multiplexed bit stream MEg input to the
object data selecting apparatus 102.

To be specific, the multiplexed bit stream is
interpreted by the multiplexed stream interpreter 61, and the
composition stream and the stream association table
which are auxiliary data Dsub included in the
multiplexed bit stream MEg are detected and supplied to the
CPU 63. In the CPU 63, the object table T1 is created
according to the auxiliary data Dsub and stored in the
data storage. The process of creating the object table
T1 is identical to that already described for the first
embodiment.

When a signal lu specifying an object is input by the
user or the like (step S72), the CPU 63 retrieves the
object table according to the object specifying signal lu,
specifies the logical channel LC of the specified object,
and sends the logical channel LC to the object selector
62 as the control signal Cont (step S73).

Subsequently, the CPU 63 rewrites the composition 
stream (step S74) and rewrites the stream association 
table (step S75) so that the stream and the table corre- 
spond to coded data of the specified object. 

Rewriting the composition stream and the stream
association table is necessary because the correlation
of objects included in the multiplexed bit stream
changes between the input multiplexed bit stream and
the multiplexed bit stream corresponding to the
extracted object. While the rewriting is carried out, the
multiplexed bit stream output from the multiplexed
stream interpreter 61 is stored in the buffer 64.

When coded data corresponding to each object
included in the multiplexed bit stream is input through
the buffer 64 to the object selector 62 (step S76), the
object selector 62 decides whether the input coded data
corresponds to the logical channel LC of the specified
object or not, according to the control signal Cont
corresponding to the object specifying signal from the CPU
63 (step S77).

As the result of this decision, when the input coded
data does not correspond to the specified object, the
next coded data is input to the object selector 62 (step
S76). On the other hand, when the input coded data
corresponds to the specified object, the coded data is output
as coded data corresponding to the specified object
(step S78). The object selector 62 outputs the coded
data of the selected object together with the rewritten
composition stream and stream association table.

Thereafter, the object selector 62 decides whether 
or not the output coded data is the last coded data of the 
specified object. As the result of this decision, when the 
output coded data is not the last one, above-mentioned 
steps S76~S79 are repeated. On the other hand, when 
the output coded data is the last one, it is decided 
whether output of coded data of all the specified objects 
is completed or not (step S80). When it is not completed 
yet, above-mentioned steps S76~S80 are repeated. On 
the other hand, when it is completed, the process of 
selecting coded data of specified objects is ended. 
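
The selection flow of figure 6 can be condensed into the following sketch. The auxiliary data is modeled here as a dictionary keyed by object id and the packets as dictionaries with an "lc" key; these representations and the function name select_objects are assumptions for illustration, and the loop over steps S76 to S80 is collapsed into a single pass over the buffered packets.

```python
def select_objects(aux, packets, object_table, wanted_oids):
    """Extract the coded data of the specified objects from a multiplexed stream.

    Returns the rewritten auxiliary data and the selected packets.
    """
    # step S73: the table gives the logical channels of the specified objects
    wanted_lcs = {object_table[oid]["lc"] for oid in wanted_oids}
    # steps S74 and S75: keep only the descriptions of the extracted objects
    new_aux = {oid: desc for oid, desc in aux.items() if oid in wanted_oids}
    # steps S76 to S78: pass only packets on the specified logical channels
    selected = [p for p in packets if p["lc"] in wanted_lcs]
    return new_aux, selected
```

Deleting a specific object and transmitting the rest of the stream would use the complementary condition in the last two steps.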

When the object selector 62 outputs coded data of 
a specified object as described above, the transfer rate 
of the multiplexed bit stream output from the object data 
selecting apparatus 102 is lowered as compared with 
the transfer rate of the multiplexed bit stream input to 
this apparatus. So, the object selector 62 can change 
the transfer rate as desired. However, the transfer rate 
may be changed by independent means located on the 
output side of the selector 62. 

As described above, according to the second
embodiment of the invention, on the basis of a bit
stream in which coded data Eg corresponding to plural
objects are multiplexed, an object table on which the
respective objects are correlated with the coded data
thereof is created in advance. Using the object table,
coded data corresponding to a specific object is
extracted from the bit stream. Therefore, it is possible to
extract or delete coded data of a specific object from the
multiplexed bit stream at high speed in the middle of a
transmission path or the like.

Although in this second embodiment a specific 
object extracted is transmitted, a specific object 
extracted may be deleted to transmit the rest of the 
stream. 

[Embodiment 3]

Figure 7 is a block diagram for explaining an object
data recording apparatus according to a third
embodiment of the present invention.

In figure 7, an object data recording apparatus 103 
includes a data storage 84, and an object data selecting 
unit 8 that selects and extracts data stored in the data 
storage 84. The recording apparatus 103 records the 
multiplexed bit stream MEg in the data storage 84, and 
retrieves or outputs coded data of a specified object 
from the stream stored in the data storage 84 according 
to user's instruction or the like. 

The object data selecting unit 8 includes a
multiplexed stream interpreter 81 and an object selector 82.
The multiplexed stream interpreter 81 detects a
composition stream and a stream association table which are
auxiliary data Dsub included in the multiplexed bit
stream MEg. The object selector 82 selects and
extracts coded data corresponding to a specific object
from the multiplexed bit stream MEg according to a
control signal Cont.

Further, the object data selecting unit 8 includes a
CPU 83. The CPU 83 creates the object table T1 shown
in figure 2 on the basis of the composition stream and
the stream association table, and records the object
table in a prescribed region of the data storage 84, for
example, a region where management information
showing the contents of the storage 84 is recorded, or a
region where object tables are managed collectively.
The object table may be recorded in the same region as
where the multiplexed bit stream MEg is stored so that
the table is positioned at the head of the stream MEg.

Further, the CPU 83 outputs a signal for selecting
coded data of a specified object (control signal Cont)
toward the object selector 82 according to an object
specifying signal lu generated as a result of user's
operation. Furthermore, the CPU 83 rewrites the auxiliary
data relating to the composition stream and the stream
association table so that the stream and the table
correspond to the selected object, and adds the rewritten
data to the coded data of the selected object when the
coded data is output.

A description is given of the operation of the object 
data recording apparatus 103. 

Figure 8 is a flowchart showing process steps of
creating the object table.

When the multiplexed bit stream MEg is input to the
object data recording apparatus 103 and stored in the
data storage 84 (step S91), an object table corresponding
to the multiplexed bit stream is created in the object
data selecting unit 8 controlled by the CPU 83.

More specifically, the multiplexed bit stream is input
to the multiplexed stream interpreter 81 (step S92), and
the composition stream and the stream association
table which are auxiliary data Dsub included in the
multiplexed bit stream MEg are detected by the interpreter
81 and supplied to the CPU 83. In the CPU 83, an object
table (refer to figure 2) is created according to the
auxiliary data Dsub (step S93). The object table created by
the CPU 83 is stored in the data storage 84 (step S94).
Thereafter, it is decided whether an instruction to end
the process of selecting object data is input or not.
When there is no end instruction, above-mentioned
steps S91~S95 are repeated. When the end instruction
is input, the object data selecting process is ended.

The object table is stored in a prescribed region of
the data storage 84, for example, a region where
management information showing the contents of the
storage 84 is recorded, a region where object tables are
managed collectively, or the same region as where the
multiplexed bit stream MEg is stored (in this case, the
table is stored at the head of the stream).

When a signal specifying an object is input to the
recording apparatus 103, in the CPU 83, the object
table stored in the data storage 84 is retrieved, a logical
channel LC corresponding to the specified object is
specified, and the specified logical channel LC is output
to the object selector 82.



In the object selector 82, on the basis of the logical 
channel LC from the CPU 83, coded data correspond- 
ing to the specified logical channel is selected from the 
multiplexed bit stream, and the selected coded data Se 
is output. When the composition stream and the stream 
association table are changed due to the object selec- 
tion, the CPU 83 rewrites the stream and the table. The 
rewritten stream and table are input to the object selec- 
tor 82 wherein the rewritten stream and table are added 
to the selected object to be output. 

As described above, according to the third
embodiment of the present invention, in an apparatus for
recording a multiplexed bit stream including coded data
of plural objects constituting a single image, an object
table on which the respective objects are related with
the coded data thereof is created on the basis of the
multiplexed bit stream, and the coded data and the
object table corresponding to the coded data are
recorded. Therefore, the recorded multiplexed bit
stream is collectively managed by the object table,
whereby the process of retrieving and outputting a
desired object from the recorded multiplexed bit stream
is performed at high speed.

In the object data recording apparatus 103, since 
the data storage 84 can serve as a buffer, no buffer is 
disposed between the multiplexed stream interpreter 81 
and the object selector 82. However, a buffer as shown 
in figure 5 may be disposed between the interpreter 81
and the selector 82. 

[Embodiment 4] 

Figure 9 is a block diagram illustrating an object 
data multiplex coding apparatus 104a including an 
object data output unit 104 which is an object data 
processing apparatus according to a fourth embodiment 
of the present invention. 

The object data multiplex coding apparatus 104a 
comprises an encoder 87 and the object data output 
unit 104. The encoder 87 generates coded data corre- 
sponding to plural objects constituting one scene by 
coding data of the respective objects, multiplexes these 
coded data with a composition stream and a stream 
association table which are auxiliary data Dsub, and 
outputs the multiplexed data. The object data output unit
104 adds an object table on which the respective 
objects are correlated with coded data of the objects to 
the multiplexed bit stream MEg output from the encoder 
87, and outputs the multiplexed bit stream MEg with the 
object table. 

The object data output unit 104 includes a
multiplexed stream interpreter 81 which detects the
composition stream and the stream association table which
are auxiliary data Dsub included in the multiplexed bit
stream MEg according to a control signal Cont1, and a
buffer 85 which temporarily stores the multiplexed bit
stream MEg that is input to the buffer 85 through the
multiplexed stream interpreter 81. Further, the output
unit 104 includes a CPU 83 which forms an object table
T1 as shown in figure 2 on the basis of the composition
stream and the stream association table, and a
multiplexer 86 which adds the object table to the multiplexed
bit stream output from the buffer 85 according to a
control signal Cont2.

The operation of the object data output unit 104 will
be described using a flowchart shown in figure 10.

When the multiplexed bit stream generated by
compressive multiplexing in the encoder 87 is input to the
object data output unit 104 (step S111), the CPU 83
decides whether it is "start of scene" or "change of
objects constituting one scene" (step S112). When the
decision is neither "start" nor "change", the
multiplexer 86 is controlled by the control signal Cont1 so
that the input multiplexed bit stream is output as it is.

On the other hand, when it is "start of scene" or
"change of objects constituting one scene", the
multiplexed stream interpreter 81 is controlled by the control
signal Cont1 so that the multiplexed bit stream MEg is
input to the multiplexed stream interpreter 81 and
processed (step S113). In the CPU 83, the composition
stream is detected from the multiplexed bit stream (step
S114) and, subsequently, the stream association table
is extracted from the multiplexed bit stream (step S115)
and, further, an object table is created on the basis of
the composition stream and the stream association
table (step S116).

Thereafter, in the CPU 83, at the time of scene start
or scene change, the created object table is added at
the head of the multiplexed bit stream output from the
buffer 85 (step S117), and the multiplexed bit stream
with the object table is output (step S118).

Thereafter, it is decided whether an instruction to
end the process of outputting the multiplexed bit stream
from the encoder 87 is input or not (step S119). When
there is the end instruction, the output process is ended.
When there is no end instruction, the above-mentioned
steps S111 to S119 are repeated.
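
The flow of steps S111 to S119 can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: the `Segment` structure, the shapes of the composition stream (object id to stream ids) and the stream association table (stream id to logical channel), and the function names are hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, List, Optional, Tuple


@dataclass
class Segment:
    """One chunk of the multiplexed bit stream (hypothetical structure)."""
    payload: bytes
    scene_start: bool = False        # "start of scene"
    objects_changed: bool = False    # "change of objects constituting one scene"
    composition: Optional[Dict[int, List[int]]] = None  # assumed: object id -> stream ids
    association: Optional[Dict[int, int]] = None        # assumed: stream id -> logical channel


def build_object_table(composition, association):
    """Correlate each object with the channels carrying its coded data (S116)."""
    return {obj: [association[sid] for sid in sids]
            for obj, sids in composition.items()}


def output_unit(segments: Iterable[Segment]) -> Iterator[Tuple[str, object]]:
    """Emit the stream, placing an object table ahead of each new or changed scene."""
    for seg in segments:                                 # S111: stream is input
        if seg.scene_start or seg.objects_changed:       # S112: start or change?
            table = build_object_table(seg.composition,  # S113 to S116
                                       seg.association)
            yield ("table", table)                       # S117: table at the head
        yield ("data", seg.payload)                      # S118, or pass-through
    # S119: the loop ends when no more input arrives (end instruction)
```

Segments that neither start nor change a scene pass through unchanged, which mirrors the "output as it is" branch of the flowchart.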

As described above, according to the fourth
embodiment of the invention, the object data output unit
104 receives a multiplexed bit stream obtained by
multiplexing coded data of plural objects constituting one
scene, adds an object table on which the respective
objects are correlated with their coded data to the
multiplexed bit stream, and outputs the multiplexed bit
stream with the object table. Therefore, it is not
necessary for an object data decoding apparatus receiving
the multiplexed bit stream MEg with the object table to
create an object table, whereby an object data decoding
apparatus providing the same effects as the decoding
apparatus according to the first embodiment can be
realized with simplified structure.

The object data decoding apparatus to which the
multiplexed bit stream MEg and the object table are input
may be either a decoding apparatus in which the object
table and the coded data of the respective objects in the
multiplexed bit stream MEg are stored in different
storage regions or a decoding apparatus in which the
object table and the coded data are stored in the same
storage region.

While in this fourth embodiment the object data
output unit 104 outputs the multiplexed bit stream after
adding the object table at the head of the stream, the
structure of the output unit is not restricted thereto. For
example, according to the application, the object table
may be inserted in the middle of the multiplexed bit
stream and, in this case, the capacity of the buffer 85
can be decreased.

Further, in this fourth embodiment, the encoder 87
simply generates coded data corresponding to the
respective objects, and the output unit 104 receives the
multiplexed bit stream generated by the encoder and
outputs the multiplexed bit stream after adding the
object table to the stream. However, the structures of
the encoder and the output unit are not restricted
thereto. For example, the encoder may create the object
table simultaneously with formation of the composition
stream and the stream association table, and add the
object table to the multiplexed bit stream separately
from the composition stream and the stream
association table when outputting the multiplexed bit stream.
Or, the object table may be output as a part of the
composition stream. In this case, the multiplexed stream
interpreter 81 in the output unit is dispensed with, and a
conventional multiplexer can be used in the output unit.

Although in the second to fourth embodiments a 
multiplexed bit stream is described as input coded data, 
these embodiments are not restricted thereto. 

For example, an input bit stream may be a bit
stream in which coded data are partitioned in units of
objects as described for the modification of the first
embodiment. Also in this case, the same effects are
obtained by creating the object table as shown in figure
4.

In particular, even when the object data output unit
104 according to the fourth embodiment is constructed
so that it receives such a non-multiplexed bit stream and
creates the object table shown in figure 4, the
object data decoding apparatus which receives the
multiplexed bit stream MEg with the object table output
from the output unit may be either a decoding apparatus
in which the object table and the coded data of the
respective objects in the multiplexed bit stream MEg are
stored in different storage regions or a decoding
apparatus in which the object table and the coded data are
stored in the same storage region.

Furthermore, although in the first to fourth
embodiments a multiplexed bit stream including coded data
corresponding to video data and audio data is
described, any multiplexed bit stream may be employed
as long as it includes coded data of plural objects
constituting individual information to be recorded or
transmitted. For example, the respective embodiments may
employ a multiplexed bit data including, as coded data
of objects, only coded data of video data, audio data, or
computer data, or a multiplexed bit data including coded
data of other data.

Furthermore, in the first to fourth embodiments, the
object table is created on the basis of the composition
table and the stream association table which are
included in the multiplexed bit stream MEg as auxiliary
data. However, in an information transmission system
corresponding to MPEG4 which is currently being
standardized, the format of scene description is
different from that according to the composition table, and an
object descriptor showing the correspondence between
object id and stream id is employed in place of the
stream association table showing the correspondence
between stream id and logical channel.

A description is now given of an object data
transmission system according to MPEG4.

Figure 11 is a diagram illustrating the structure of 
the object data transmission system 200. 

In this system 200, coded video data Ev and coded
audio data Ea corresponding to objects constituting a
single scene 201 and system information Si as auxiliary
data are multiplexed by a multiplexer 202, and a
multiplexed bit stream MEg1 obtained as a result of the
multiplexing is transmitted through a transmission medium
or stored in a storage medium.

The multiplexed bit stream MEg1 transmitted
through the transmission medium or read from the
storage medium is demultiplexed (divided) into the coded
data Ev and Ea, and the system information Si by a
demultiplexer 203.

To be specific, the scene 201 is composed of a
background object OB1 (scenery), a sound object OB4
attendant on the background object OB1, a foreground
object OB2 (person), and a voice object OB3 attendant
on the foreground object OB2. The coded video data Ev
is divided into coded data Ev1 corresponding to the
background object OB1 and coded data Ev2
corresponding to the foreground object OB2. The coded
audio data Ea is divided into coded data Ea1
corresponding to the voice object OB3 and coded data Ea2
corresponding to the sound object OB4. The system
information Si as auxiliary data is divided into scene
description information Sf and object descriptor OD.

Receiving the respective data separated from the
multiplexed bit stream MEg1, a decoder 204 generates
regeneration data Rg corresponding to the scene 201
according to these data.

That is, in the scene description information Sf, the
hierarchical structure of the objects OB1 to OB4 is
described together with the relationship between the
objects in each layer and their object indices. In the
object descriptor OD, the relationship between the
object indices and the stream indices (i.e., coded data
corresponding to the objects) is described.

Accordingly, the decoder 204 performs decoding
and composition of coded data of the respective objects
on the basis of the scene description information Sf and
the object descriptor OD, and generates regeneration
data Rg for displaying the scene 201.
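
As a hedged illustration of how the scene description information Sf and the object descriptor OD work together for the scene 201, the short Python sketch below walks an assumed hierarchy and lists which coded stream belongs to each object. The dictionary layout, the choice to model "attendant" objects as children, and the function name `decode_order` are assumptions made for this example only.

```python
# Scene description Sf (assumed shape): each object -> objects attendant on it.
SCENE_DESCRIPTION = {"OB1": ["OB4"], "OB2": ["OB3"], "OB3": [], "OB4": []}
ROOTS = ["OB1", "OB2"]  # background and foreground objects

# Object descriptor OD: object -> coded data stream, as listed in the text.
OBJECT_DESCRIPTOR = {"OB1": "Ev1", "OB2": "Ev2", "OB3": "Ea1", "OB4": "Ea2"}


def decode_order(roots, scene, od):
    """Walk the hierarchy depth-first and list (object, stream) pairs."""
    order = []
    stack = list(reversed(roots))
    while stack:
        obj = stack.pop()
        order.append((obj, od[obj]))        # look up the stream for this object
        stack.extend(reversed(scene[obj]))  # then visit its attendant objects
    return order


pairs = decode_order(ROOTS, SCENE_DESCRIPTION, OBJECT_DESCRIPTOR)
```

A decoder in the spirit of decoder 204 would decode each listed stream and compose the results according to the hierarchy; the composition itself is outside the scope of this sketch.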

Furthermore, in the system 200 in figure 11, the
objects and the object descriptors are defined with the
video data being distinguished from the audio data.

[Embodiment 5] 

An object data decoding apparatus of a fifth
embodiment of the present invention in the object data
transmission system 200 will now be described.

Figure 12 is a block diagram showing the object
data decoding apparatus of the fifth embodiment. Note
that the objects and the object descriptors are defined
without distinguishing the video data from the audio
data in this embodiment.

Referring to figure 12, an object data decoding
apparatus 105 is shown. The object data decoding
apparatus 105 is used for receiving a multiplexed bit
stream MEg1 comprising scene description information
Sf and an object descriptor OD rather than the
composition stream and the stream association table as the
auxiliary data, and reproducing video data Rg of one
scene from the multiplexed bit stream MEg1.

The multiplexed bit stream MEg1 comprises coded
data in which scene data of one scene 150 in figure
13(a) has been coded for each object of the scene, and
the auxiliary data.

Referring to figures 13(a) and 13(b), the scene 150
comprises plural objects (small images) of a
hierarchical structure. More specifically, the scene 150
comprises a background image 151 as a background, a
mobile 152 moving in the background, a logo (Let's start)
153 displayed on the background image, and first and
second wheels 154 and 155, which correspond to the
objects. The background image 151 serves as a node,
and the mobile 152 and the logo 153 belong thereto.
Also, the mobile 152 serves as a node, and the first and
second wheels 154 and 155 belong thereto. Coded data
of the mobile 152 comprises coded data of a window
152a, a body 152b, and a chimney 152c.

The auxiliary data of the multiplexed bit stream
MEg1 comprises the scene description information and
the object descriptor. Figure 14(a) shows a scene
description SD1 on the basis of the scene description
information.

The scene description SD1 describes the scene
150. In figure 14(a), Object(1) 161 to Object(5) 165 are
shown, which are descriptors which indicate the
background image 151, the mobile 152, the logo 153, and
the first and second wheels 154 and 155, respectively.
As is seen from these descriptors, the mobile 152 and
the logo 153 belong to the background image 151, and
the first and second wheels 154 and 155 belong to the
mobile 152. To each of the descriptors 161 to 165, an
Object id (Oid) by which coded data of respective
objects of the multiplexed bit stream MEg1 can be
identified is given ("id" indicates index). Specifically, to the
descriptors 161 to 165, Object id (Oid) 1 to 5 are given,
respectively.

Furthermore, in an object descriptor OD shown in
figure 14(b), correspondence between Object id and
Stream id is shown. As shown in figure 14(b), Object id
(Oid=1), Object id (Oid=2), Object id (Oid=3), and
Object id (Oid=4, 5) correspond to Stream id (Sid=1, 2),
Stream id (Sid=3 to 6), Stream id (Sid=7), and Stream
id (Sid=8), respectively.

Furthermore, in the object descriptor OD, the data type of
each object, i.e., "Video" or "Audio", is described.
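
The correspondence of figure 14(b) as stated above can be written down directly. The following Python sketch is only illustrative: it encodes the Oid to Sid mapping from the text as a dictionary and inverts it to find streams shared by several objects. The helper names are hypothetical, and the data type column is omitted because the text does not list it per object.

```python
# Object descriptor OD of figure 14(b): Object id (Oid) -> Stream ids (Sid).
OBJECT_DESCRIPTOR = {
    1: [1, 2],        # Oid=1 -> Sid=1, 2
    2: [3, 4, 5, 6],  # Oid=2 -> Sid=3 to 6
    3: [7],           # Oid=3 -> Sid=7
    4: [8],           # Oid=4 and Oid=5 both map to Sid=8
    5: [8],
}


def objects_by_stream(od):
    """Invert the descriptor: stream id -> list of object ids using it."""
    result = {}
    for oid, sids in od.items():
        for sid in sids:
            result.setdefault(sid, []).append(oid)
    return result


def shared_streams(od):
    """Return only the streams that are shared by more than one object."""
    return {sid: oids for sid, oids in objects_by_stream(od).items()
            if len(oids) > 1}


shared = shared_streams(OBJECT_DESCRIPTOR)
```

Here `shared` identifies stream 8 as common to objects 4 and 5, which is the kind of information an object table would record as "common object indices".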

The object data decoding apparatus 105 of the fifth 
embodiment will be described hereinafter. 

Referring to figure 12 again, a basic construction of
the object data decoding apparatus 105 is identical to
that of the object data decoding apparatus 101 of the
first embodiment. Specifically, the decoding apparatus
105 comprises a demultiplexer 11a for extracting the
scene description information Sf and the object
descriptor OD as auxiliary data Dsub included in the
multiplexed bit stream MEg1, and extracting coded data Eg
of each object from the multiplexed bit stream MEg1 in
accordance with a first control signal Cont1, an audio
and video (AV) decoder 12 for decoding the coded data
Eg in accordance with a second control signal Cont2
and outputting reproduced data Rg of each scene, and
a CPU 13a for deciding a stream id of the coded data Eg
on the basis of the object descriptor OD and supplying
the demultiplexer 11a with the first control signal Cont1
on the basis of the decision result, and supplying the AV
decoder 12 with information on placement of objects of
one scene and information on display start time of each
object using the control signal Cont2 on the basis of the
scene description information Sf.

In this fifth embodiment, the CPU 13a is used to create
an object table indicating correspondence between
objects and the corresponding coded data Eg (stream)
on the basis of the scene description information Sf and
the object descriptor OD, and the object table is stored
in a data storage means in the CPU 13a.

Figures 15(a) and 15(b) are diagrams showing an
object table T3 of the scene 150.

Referring to these figures, the object table T3 has a
hierarchical structure, and comprises a main table T3a
indicating a correspondence between objects of the
scene 150 and the corresponding streams, and a sub
table T3b indicating a correspondence between video
or audio of each object and the corresponding stream.

In these tables, various information associated with
Object id is entered.

Specifically, the Object id is given to each object
descriptor in the scene description in figure 14(a) to
have a one-to-one correspondence, and entered in
ascending order of value "Oid" of the Object id.

Object indices (Oid = 1 to 5) are given to the
descriptors 161 to 165 of objects 151 to 155, respectively.

In the main table T3a, the type of each object (video or
audio) and the corresponding stream id are entered. In
the main table T3a, indices of upper and lower objects
of each object, stream indices of the upper and lower
objects, common object indices of objects which share
a stream, and priorities of respective objects are also
entered.

The upper object is in a higher-order layer than an
object, and the lower object is in a lower-order layer
than the object.

Specifically, since the uppermost object, i.e., the
object in a first layer L1a in figure 13(a), has no upper
object, the Oid of its upper object id and the Sid of its
upper stream id are both "0". Upper objects of objects
in second and third layers L2a and L3a are objects in
the first and second layers L1a and L2a, respectively.
In the case of a lowermost object, i.e., an object having
no lower object, the Oid of its lower object id and the
Sid of its lower stream id are both "0".

In the case of objects 151 and 152, each comprising
plural pieces of video and audio, the corresponding
stream types "H" are described in the main table T3a,
and stream indices of coded data of video and audio of
these objects are described in the sub table T3b.

As can be seen from the object table T3, an object
having Oid=1 comprises two objects, since Oid of the
corresponding lower object id "2, 3" is described, and
an object having Oid=2 comprises two objects, since
Oid of the corresponding lower object id "4, 5" is
described.
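
The upper and lower object entries of the main table T3a can be derived mechanically from the hierarchy and the object descriptor. The Python sketch below is a minimal illustration under stated assumptions: the `PARENT` mapping, the field names, and the use of `[0]` for "no upper or lower object" are hypothetical choices, and the type and priority columns are left out because the text does not give their values.

```python
# Hierarchy of scene 150: object id -> upper object id (0 means "none").
PARENT = {1: 0, 2: 1, 3: 1, 4: 2, 5: 2}
# Object descriptor OD of figure 14(b): object id -> stream ids.
OD = {1: [1, 2], 2: [3, 4, 5, 6], 3: [7], 4: [8], 5: [8]}


def build_main_table(parent, od):
    """Build one row per object, in ascending order of Oid."""
    table = {}
    for oid in sorted(parent):
        upper = parent[oid]
        lower = [c for c in sorted(parent) if parent[c] == oid]
        table[oid] = {
            "stream_ids": od[oid],
            "upper_oid": upper,                   # 0 for the uppermost object
            "upper_sids": od.get(upper, [0]),     # [0] when there is no upper object
            "lower_oids": lower or [0],           # [0] for a lowermost object
            # Other objects that share at least one stream with this one.
            "common_oids": [o for o in sorted(od)
                            if o != oid and set(od[o]) & set(od[oid])],
        }
    return table


main_table = build_main_table(PARENT, OD)
```

With this data, the rows for Oid=1 and Oid=2 list lower objects "2, 3" and "4, 5" respectively, matching the observation made from the object table T3.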

Subsequently, operation of the object data decod- 
ing apparatus 105 of the fifth embodiment will now be 
described. 

Referring to figure 12 again, when the multiplexed
bit stream MEg1 is input to the object data decoding
apparatus 105, the demultiplexer 11a extracts the scene
description information SD1 and the object descriptor
OD as the auxiliary data Dsub from the multiplexed bit
stream MEg1 and outputs the Dsub to the CPU 13a.

The CPU 13a recognizes a correspondence
between objects of the scene 150 in figure 13(a) and the
corresponding stream indices on the basis of the object
descriptor OD, and outputs the first control signal Cont1
on the basis of the result to the demultiplexer 11a.

The demultiplexer 11a collects plural packeted
streams in the multiplexed bit stream for each object
and outputs the resulting stream to the AV decoder 12.

At this time, the CPU 13a extracts information on a
display position and display start time of each object
from the scene description information SD1 and outputs
the information to the AV decoder 12 as the second
control signal Cont2.

The AV decoder 12 decodes streams of respective
objects (a series of coded data Eg) from the
demultiplexer 11a, synthesizes decoded data of respective
objects in accordance with the second control signal
Cont2, and outputs reproduced data Rg of one scene
comprising plural objects.
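
The per-object collection performed by the demultiplexer can be pictured with the following Python sketch. It is only a rough illustration: packets are modelled as hypothetical `(stream id, payload)` pairs, and the function name `collect_per_object` is an assumption, not a name from the specification.

```python
def collect_per_object(packets, od):
    """Group packet payloads by object, using the object descriptor od.

    packets: iterable of (stream_id, payload_bytes) pairs (assumed format).
    od:      object id -> list of stream ids.
    """
    # Invert the descriptor so each stream id points at its objects.
    sid_to_oids = {}
    for oid, sids in od.items():
        for sid in sids:
            sid_to_oids.setdefault(sid, []).append(oid)

    streams = {oid: bytearray() for oid in od}
    for sid, payload in packets:
        # Packets whose stream id belongs to no object are ignored.
        for oid in sid_to_oids.get(sid, []):
            streams[oid] += payload
    return {oid: bytes(buf) for oid, buf in streams.items()}


collected = collect_per_object(
    [(1, b"a"), (8, b"w"), (2, b"b"), (9, b"x")],
    {1: [1, 2], 4: [8], 5: [8]},
)
```

Each resulting byte string stands in for the series of coded data Eg that would be handed to the AV decoder for one object.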

This decoding is identical to that of the prior art
object decoding apparatus as already described.



In this fifth embodiment, in addition to the decoding,
the CPU 13a creates the object table T3 in figures 15(a)
and 15(b).

Hereinafter, creation of the object table T3 will be
described following the flowchart in figure 16.

In step S1a, the scene description information Sf is
read to the data storage means of the CPU 13a. In step
S2a, the object descriptor OD is read to the data
storage means of the CPU 13a. In step S3a, descriptors of
respective objects of the scene description SD1 on the
basis of the scene description information Sf are loaded
into an operating unit of the CPU 13a, and Oid of each
object id is given to each descriptor, so that an object
can be identified by the corresponding object id.

In step S6a, the Oid is entered in the object table T3
as "id".

In step S7a, the object descriptor OD is interpreted.
On the basis of the interpretation result, various table
components of respective objects are entered in the
object table T3. The components are the stream id, the
priority, and the stream type (Video or Audio) of each object.
At this time, as the table components, indices of the
upper and lower objects, stream indices of the upper
and lower objects, and the common object indices are
also entered in the object table T3.

In step S9a, it is decided whether the object which
is being processed is in the uppermost layer or not.

When it is decided in step S9a that the object is not in
the uppermost layer, in step S12a the object layer in the
hierarchy is raised by one, and then the decision of
step S9a is performed again. On the other hand,
when it is decided in step S9a that the object is in the
uppermost layer, in step S10a it is decided whether table
components of all objects have been entered in the
object table T3 or not.

When it is decided in step S10a that the components
have not all been entered, the CPU 13a performs step S3a
again and the steps S6a, S7a, S9a, and S12a are
performed. On the other hand, when it is decided that the
components have been entered, the CPU 13a completes
creating the object table.
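
The loop structure of figure 16 can be sketched in Python as below. This is a loose interpretation, not the patented procedure itself: the `parent` mapping stands in for the scene description, only the id and stream id components are entered, and recording the layer number found by the S9a/S12a climb is an assumption added so that the loop has a visible result.

```python
def create_object_table(parent, od):
    """Create a simplified object table following steps S3a to S12a.

    parent: object id -> upper object id, 0 for the uppermost layer
            (assumed representation of the scene description).
    od:     object id -> list of stream ids (object descriptor).
    """
    table = {}
    pending = sorted(parent)              # S3a: descriptors loaded, Oid assigned
    while pending:                        # S10a: repeat until all objects entered
        oid = pending.pop(0)
        entry = {"id": oid}               # S6a: enter Oid as "id"
        entry["stream_ids"] = od[oid]     # S7a: enter components from OD
        layer, node = 1, oid
        while parent[node] != 0:          # S9a: is this the uppermost layer?
            node = parent[node]           # S12a: raise the layer by one
            layer += 1
        entry["layer"] = layer            # assumption: depth kept for illustration
        table[oid] = entry
    return table


t3 = create_object_table(
    {1: 0, 2: 1, 3: 1, 4: 2, 5: 2},
    {1: [1, 2], 2: [3, 4, 5, 6], 3: [7], 4: [8], 5: [8]},
)
```

For the scene 150, the background image (Oid=1) ends up in layer 1, the mobile and logo in layer 2, and the wheels in layer 3.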

The object table thus created is stored in the data storage
means of the CPU 13a. Each time the scene description
information Sf and the object descriptor OD are
updated, the stored table is also updated to describe
the newest information. Therefore, the object table remains
unchanged unless an object of one scene is changed.

When there is a flag in a bit stream indicating that
new scene description information Sf and a new object
descriptor OD have been transmitted, the object table
may be updated only when the flag is transmitted.
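
A flag-driven update of the stored table could look like the following Python sketch. The class name, the `builder` callback, and the `rebuilds` counter are hypothetical and serve only to show that the table is rebuilt when, and only when, the flag is present.

```python
class ObjectTableCache:
    """Keep the latest object table and rebuild it only when flagged."""

    def __init__(self, builder):
        self._builder = builder  # function (scene_info, od) -> object table
        self.table = None
        self.rebuilds = 0        # counts how often the table was rebuilt

    def on_segment(self, updated_flag, scene_info=None, od=None):
        """Process one stream segment; rebuild only if the flag is set."""
        if updated_flag:
            self.table = self._builder(scene_info, od)
            self.rebuilds += 1
        return self.table


cache = ObjectTableCache(lambda scene_info, od: dict(od))
first = cache.on_segment(True, {}, {1: [1, 2]})
second = cache.on_segment(False)  # no flag: the stored table is reused
```

Segments without the flag return the stored table untouched, so the cost of interpreting Sf and OD is paid only when the scene actually changes.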

Also in the object data decoding apparatus 105, the 
same effects as provided in the object data decoding 
apparatus 101 of the first embodiment are obtained. 

Although in the fifth embodiment the object data
decoding apparatus has been described as the object
data processing apparatus in the system according to
MPEG4, the object data selecting apparatus of the
second embodiment, the object data recording apparatus
of the third embodiment, and the object data output
apparatus of the fourth embodiment can each create
the object table T3 from the scene description
information Sf and the object descriptor OD.

In addition, in the object data decoding apparatus 
which receives the multiplexed bit stream MEg output 
from the data output apparatus, creating the object table 
is dispensed with, and therefore the same effects as in 
the fifth embodiment are obtained with a simple con- 
struction. 

Although in this fifth embodiment the objects and
the object descriptors are defined without distinguishing
the video data from the audio data, they may be defined
with the video data being distinguished from the audio
data as in the system 200 in figure 11. In this case,
since the data type is clearly shown in the scene
description, it is not necessary to describe the data type
in the object descriptor.

Furthermore, when a program which implements the
constructions of the processing apparatus and the
recording apparatus is recorded in a data recording medium
such as a floppy disc, the processing in the
embodiments can be carried out with ease in an independent
computer system. This is described below.

Figures 17(a) to 17(c) are diagrams showing signal
processing of the object data processing apparatus and
the object data recording apparatus of the embodiments
in a computer system using a floppy disc which stores a
program of the signal processing.

Figure 17(a) shows a front appearance and a
cross-section of a floppy disc FD, and a floppy disc body
D as a recording medium, and Figure 17(b) shows a
physical format of the floppy disc body D.

Referring to figures 17(a) and 17(b), the floppy disc
body D is stored in a case F, and in a surface thereof,
plural tracks Trs are formed concentrically from outer to
inner radius thereof, each track being divided into 16
sectors Se in an angle direction. Data of the program is
recorded in allocated areas on the floppy disc body D.

Figure 17(c) is a diagram showing a construction
with which the program is recorded/reproduced in/from
the floppy disc FD. In case of recording the program in
the floppy disc FD, data of the program is written thereto
through the floppy disc drive FDD from the computer
system Cs. In another case of constructing the image
transmission method or image decoding apparatus in
the computer system Cs using the program in the floppy
disc FD, the program is read from the floppy disc FD by
means of the floppy disc drive FDD and transferred to
the computer system Cs.

Although image processing in the computer system
using the floppy disc as the data recording medium has
been described, this image processing may also be implemented
using an optical disc. Further, the recording medium is
not limited thereto, and an IC card, a ROM cassette, or the
like may be used so long as it can record a program.

Although the data recording medium which stores
the program of transmission or decoding in the
embodiments has been described, it may store the multiplexed
bit stream MEg or non-multiplexed bit stream in the
embodiments. The data storage means of the recording
apparatus of the third embodiment may be realized
using the data recording medium in figures 17(a) to
17(c).

Claims 

1. An object data processing apparatus for decoding
N pieces of coded data (N = positive integer)
obtained by compressively coding N pieces of
object data which constitute individual data to be
recorded or transmitted and have a hierarchical
structure, for each object data, said apparatus
including:

hierarchical information extraction means for
extracting hierarchical information showing the
hierarchical relationship of the N pieces of
object data, according to the coded data; and

table creation means for creating, according to
the hierarchical information, an object table on
which the respective object data are correlated
with coded data corresponding to the
respective object data.

2. An object data processing apparatus for decoding
N pieces of coded data (N = positive integer)
obtained by compressively coding scene data
corresponding to one scene, for each of N pieces of
objects constituting the scene, said apparatus
including:

hierarchical information extraction means for
extracting hierarchical information showing the
hierarchical relationship of the respective
objects constituting the scene, according to the
coded data; and

table creation means for creating, according to
the hierarchical information, an object table on
which the respective objects are correlated
with coded data corresponding to the
respective objects.

3. The object data processing apparatus of claim 2
wherein:

said hierarchical information extraction means
is constructed so that it extracts priority
information showing the priority order of the
respective objects, according to the coded data, in
addition to the hierarchical information; and

said table creation means is constructed so
that it creates, according to the hierarchical
information and the priority information, an
object table on which the respective objects are
correlated with coded data corresponding to
the respective objects and the priority order of
the respective objects are shown.

4. The object data processing apparatus of claim 2 
further including: 

identification information detection means for 
detecting identification information for identify- 
ing coded data of a specific object designated, 
with reference to the object table; and 
decoding means for extracting coded data of 
the specific object from the N pieces of coded 
data according to the identification information, 
and decoding the extracted coded data. 

5. An object data processing apparatus for decoding
multiplexed data including N pieces of coded data
(N = positive integer) obtained by compressively
coding scene data corresponding to one scene, for
each of N pieces of objects constituting the scene,
said apparatus including:

hierarchical information extraction means for
extracting hierarchical information showing the
hierarchical relationship of the N pieces of
objects constituting the scene, according to
information showing the correlation of the
respective coded data and included in the
multiplexed data; and

table creation means for creating, according to
the hierarchical information, an object table on
which the respective objects are correlated
with coded data corresponding to the
respective objects.

6. The object data processing apparatus of claim 5
wherein:

said hierarchical information extraction means
is constructed so that it extracts priority
information showing the priority order of the
respective objects, according to the multiplexed data,
in addition to the hierarchical information; and

said table creation means is constructed so
that it creates, according to the hierarchical
information and the priority information, an
object table on which the respective objects are
correlated with coded data corresponding to
the respective objects and the priority order of
the respective objects are shown.

7. The object data processing apparatus of claim 5
further including:

identification information detection means for
detecting identification information for
identifying coded data of a specific object designated,
with reference to the object table; and

decoding means for extracting coded data of
the specific object from the multiplexed data
according to the identification information, and
decoding the extracted coded data.

8. An object data processing apparatus for selecting
coded data of a specific object data from N pieces
of coded data (N = positive integer) obtained by
compressively coding N pieces of object data which
constitute individual data to be recorded or
transmitted and have a hierarchical structure, for each
object data, said apparatus including:

hierarchical information extraction means for
extracting hierarchical information showing the
hierarchical relationship of the N pieces of
object data, according to the coded data; and

table creation means for creating, according to
the hierarchical information, an object table on
which the respective object data are correlated
with coded data corresponding to the
respective object data;

said apparatus selecting coded data of a
specific object data from the N pieces of coded
data with reference to the object table and
outputting the selected coded data.

9. An object data processing apparatus for selecting
coded data of a specific object from N pieces of
coded data (N = positive integer) obtained by
compressively coding scene data corresponding to one
scene, for each of N pieces of objects constituting
the scene, said apparatus including:

hierarchical information extraction means for
extracting hierarchical information showing the
hierarchical relationship of the respective
objects constituting the scene, according to the
coded data; and

table creation means for creating, according to
the hierarchical information, an object table on
which the respective objects are correlated
with coded data corresponding to the
respective objects;

said apparatus selecting coded data of a
specific object from the N pieces of coded data
with reference to the object table and outputting
the selected coded data.

10. An object data processing apparatus for selecting
coded data of a specific object from multiplexed
data including N pieces of coded data (N = positive
integer) obtained by compressively coding scene
data corresponding to one scene, for each of N
pieces of objects constituting the scene, said
apparatus including:

hierarchical information extraction means for
extracting hierarchical information showing the
hierarchical relationship of the respective
objects constituting the scene, according to
information showing the correlation of the
respective coded data and included in the
multiplexed data; and

table creation means for creating, according to
the hierarchical information, an object table on
which the respective objects are correlated
with coded data corresponding to the
respective objects;

said apparatus selecting coded data of a
specific object from the multiplexed data with
reference to the object table and outputting the
selected coded data.

11. An object data recording apparatus having a data
storage for storing data, and recording N pieces of
coded data (N = positive integer) in said data
storage, which coded data are obtained by
compressively coding N pieces of object data which
constitute individual data to be recorded or
transmitted and have a hierarchical structure, for each
object data, said apparatus including:

hierarchical information extraction means for
extracting hierarchical information showing the
hierarchical relationship of the respective
object data, according to the coded data; and

table creation means for creating, according to
the hierarchical information, an object table on
which the respective object data are correlated
with coded data corresponding to the
respective object data;

said apparatus recording the N pieces of coded
data and the object table corresponding to
these coded data in said data storage.

12. An object data recording apparatus having a data 
storage for storing data, and recording N pieces of 
coded data (N = positive integer) in the data stor-
age, which coded data are obtained by compres-
sively coding scene data corresponding to one
scene, for each of N pieces of objects constituting 
the scene, said apparatus including: 

hierarchical information extraction means for 
extracting hierarchical information showing the 
hierarchical relationship of the respective 
objects constituting the scene, according to the 
coded data; and 

table creation means for creating, according to 
the hierarchical information, an object table on
which the respective objects are correlated 
with coded data corresponding to the respec- 
tive objects; 

said apparatus recording the N pieces of coded 
data and the object table corresponding to
these coded data in said data storage. 

13. The object data recording apparatus of claim 12
wherein the object table corresponding to the N 
pieces of coded data is added to the N pieces of 
coded data when being recorded. 

14. The object data recording apparatus of claim 12
wherein the object table corresponding to the N 
pieces of coded data is separated from the N 
pieces of coded data when being recorded. 

15. An object data recording apparatus having a data 
storage for storing data, and recording multiplexed 
data including N pieces of coded data (N = positive
integer) in said data storage, which coded data are 
obtained by compressively coding scene data cor- 
responding to one scene, for each of N pieces of 
objects constituting the scene, said apparatus 
including: 

hierarchical information extraction means for 
extracting hierarchical information showing the 
hierarchical relationship of the respective 
objects constituting the scene, according to 
information showing the correlation of the 
respective coded data and included in the mul-
tiplexed data; and

table creation means for creating, according to
the hierarchical information, an object table on 
which the respective objects are correlated
with coded data corresponding to the respec- 
tive objects; 

said apparatus recording the multiplexed data 
and the object table corresponding to the multi-
plexed data in said data storage.

16. The object data recording apparatus of claim 15
wherein the object table corresponding to the multi- 
plexed data is added to the multiplexed data when 
being recorded. 

17. The object data recording apparatus of claim 15
wherein the object table corresponding to the multi-
plexed data is separated from the multiplexed data
when being recorded. 

18. A data storage medium containing relevant data 
relating to individual data to be recorded or trans-
mitted, wherein said relevant data includes an
object table on which N pieces of object data (N = 
positive integer) constituting the individual data and 
having a hierarchical structure are correlated with N 
pieces of coded data obtained by compressively 
coding the respective object data. 

19. A data storage medium containing relevant data
corresponding to one scene, wherein said relevant
data includes an object table on which N pieces of
coded data (N = positive integer) obtained by com- 
pressively coding scene data corresponding to one 
scene for each of N pieces of objects constituting 
the scene are correlated with the respective 
objects. 

20. An object data processing apparatus for outputting 
N pieces of coded data (N = positive integer) 
obtained by compressively coding N pieces of 
object data which constitute individual data to be
recorded or transmitted and have a hierarchical 
structure, for each object data, said apparatus 
including:

hierarchical information extraction means for 
extracting hierarchical information showing the 
hierarchical relationship of the N pieces of 
object data, according to the coded data; and 
table creation means for creating, according to 
the hierarchical information, an object table on 
which the respective object data are correlated
with coded data corresponding to the respec-
tive object data;

said apparatus outputting the N pieces of 
coded data to which the object table corre- 
sponding to these coded data is added. 

21. An object data processing apparatus for outputting 
N pieces of coded data (N = positive integer) 
obtained by compressively coding scene data cor- 
responding to one scene, for each of N pieces of 
objects constituting the scene, said apparatus 
including:

hierarchical information extraction means for 
extracting hierarchical information showing the 
hierarchical relationship of the respective
objects constituting the scene, according to the
coded data; and 

table creation means for creating, according to 
the hierarchical information, an object table on 
which the respective objects are correlated
with coded data corresponding to the respec-
tive objects;

said apparatus outputting the N pieces of
coded data to which the object table corre-
sponding to these coded data is added.

22. An object data processing apparatus for decoding 
data output from the object data processing appa-
ratus according to claim 21, including:

data separation means for separating the 
object table from the output data; and 
table storage means for storing the separated
object table; 
wherein decoding of the coded data cor-
responding to the respective objects is control-
led using the information shown in the object
table stored in the table storage means. 
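
As an illustration only, separating an added object table from the output data and storing it for later lookup can be sketched as follows. The claims do not specify how the table is attached, so the framing used here (a 4-byte big-endian length prefix followed by a JSON-encoded table and then the coded payload) is purely an assumption, as are the names `attach_table`, `separate_table`, `TableStore` and `extract_object`.

```python
import json
import struct
from typing import Dict, Tuple


def attach_table(table: Dict[str, dict], payload: bytes) -> bytes:
    """Prefix the payload with a length-delimited table (assumed layout)."""
    header = json.dumps(table, sort_keys=True).encode("utf-8")
    return struct.pack(">I", len(header)) + header + payload


def separate_table(data: bytes) -> Tuple[Dict[str, dict], bytes]:
    """Split the output data back into (object table, coded payload)."""
    (header_len,) = struct.unpack_from(">I", data, 0)
    table = json.loads(data[4:4 + header_len].decode("utf-8"))
    return table, data[4 + header_len:]


class TableStore:
    """Holds the separated object table so decoding can consult it."""

    def __init__(self) -> None:
        self._table: Dict[str, dict] = {}

    def store(self, table: Dict[str, dict]) -> None:
        self._table = dict(table)

    def locate(self, name: str) -> Tuple[int, int]:
        entry = self._table[name]
        return entry["offset"], entry["length"]


def extract_object(store: TableStore, payload: bytes, name: str) -> bytes:
    """Use the stored table to pick out the coded data of one object."""
    offset, length = store.locate(name)
    return payload[offset:offset + length]
```

With the table held separately, a decoder controller can decide which objects to decode before touching any of the coded data itself.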

23. An object data processing apparatus for outputting
multiplexed data including N pieces of coded data
(N = positive integer) obtained by compressively 
coding scene data corresponding to one scene, for 
each of N pieces of objects constituting the scene, 
said apparatus including: 

hierarchical information extraction means for 
extracting hierarchical information showing the 
hierarchical relationship of the respective 
objects constituting the scene, according to 
information showing the correlation of the 
respective coded data and included in the mul- 
tiplexed data; and 

table creation means for creating, according to 
the hierarchical information, an object table on 
which the respective objects are correlated
with coded data corresponding to the respec-
tive objects;

said apparatus outputting the multiplexed data
to which the object table corresponding to the 
multiplexed data is added. 

24. An object data processing apparatus for decoding 
data output from the object data processing appa- 
ratus according to claim 23, including: 

data separation means for separating the 
object table from the output data; and 
table storage means for storing the separated 
object table; 

wherein decoding of the coded data cor-
responding to the respective objects is control-
led using the information shown in the object
table stored in the table storage means. 

25. A data structure for transmitting N pieces of coded
data (N = positive integer) obtained by compres-
sively coding N pieces of object data which consti-
tute individual data to be recorded or transmitted
and have a hierarchical structure, for each object 
data: 

wherein a data group comprising the N 
pieces of coded data includes an object table on 
which the respective object data are correlated with
coded data corresponding to the respective object
data. 

26. A data structure for transmitting N pieces of coded
data (N = positive integer) obtained by compres- 
sively coding scene data corresponding to one 
scene, for each of N pieces of objects constituting 
the scene: 



wherein a data group comprising the N 
pieces of coded data includes an object table on 
which the respective objects are correlated with
coded data corresponding to the respective 
objects. 

27. An object data processing apparatus for processing 
multiplexed data including N pieces of coded data 
(N = positive integer) and being partitioned into plu- 
ral packets each having a prescribed code quantity, 
which coded data are obtained by compressively 
coding N pieces of object data which constitute indi- 
vidual data to be recorded or transmitted and have 
a hierarchical structure, for each object data, said
apparatus including: 

hierarchical information extraction means for 
extracting hierarchical information showing the 
hierarchical relationship of the N pieces of 
object data, according to information showing 
the correlation of the respective coded data
and included in the multiplexed data; and 
table creation means for creating, according to 
the hierarchical information, an object table 
showing the hierarchical relationship of the plu-
ral packets constituting the multiplexed data.

28. An object data processing apparatus for processing
multiplexed data including N pieces of coded data 
(N = positive integer) and being partitioned into plu- 
ral packets each having a prescribed code quantity, 
which coded data are obtained by compressively 
coding scene data corresponding to one scene, for 
each of N pieces of objects constituting the scene, 
said apparatus including: 

hierarchical information extraction means for
extracting hierarchical information showing the 
hierarchical relationship of the respective 
objects constituting the scene, according to
information showing the correlation of the
respective coded data included in the multi- 
plexed data; and 

table creation means for creating, according to
the hierarchical information, an object table 
showing the hierarchical relationship of the plu- 
ral packets constituting the multiplexed data. 
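
As an illustration only, an object table that relates fixed-size packets to the object hierarchy can be sketched as follows. This is a minimal sketch under stated assumptions: each packet is modelled as an (object identifier, payload) pair, the hierarchy is given as a child-to-parent mapping, and the names `packetize`, `build_packet_table` and `reassemble` are hypothetical rather than taken from the specification.

```python
from typing import Dict, List, Optional, Tuple

Packet = Tuple[int, bytes]  # (object_id, payload of at most `size` bytes)


def packetize(object_id: int, coded: bytes, size: int) -> List[Packet]:
    """Partition one object's coded data into packets of a prescribed size."""
    return [(object_id, coded[i:i + size]) for i in range(0, len(coded), size)]


def build_packet_table(
    packets: List[Packet], parent_of: Dict[int, Optional[int]]
) -> Dict[int, dict]:
    """Map each object to its parent object and the indices of its packets."""
    table: Dict[int, dict] = {}
    for index, (object_id, _payload) in enumerate(packets):
        entry = table.setdefault(
            object_id, {"parent": parent_of.get(object_id), "packets": []}
        )
        entry["packets"].append(index)
    return table


def reassemble(table: Dict[int, dict], packets: List[Packet], object_id: int) -> bytes:
    """Collect the coded data of one object from its packets, in stream order."""
    return b"".join(packets[i][1] for i in table[object_id]["packets"])
```

Indexing packets by object lets a reader skip directly to the packets of a wanted object, even when packets of different objects are interleaved in the multiplexed data.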

29. An object data recording apparatus having a data 
storage for storing data, and recording, in said stor-
age, multiplexed data which includes N pieces of
coded data (N = positive integer) and is partitioned 
into plural packets each packet having a prescribed 
coded quantity, which coded data are obtained by 
compressively coding N pieces of object data which 
constitute individual data to be recorded or trans-
mitted and have a hierarchical structure, for each
object data, said apparatus including:
hierarchical information extraction means for
extracting hierarchical information showing the 
hierarchical relationship of the N pieces of 
object data, according to information showing 
the correlation of the respective coded data
and included in the multiplexed data; and 
table creation means for creating, according to 
the hierarchical information, an object table 
showing the hierarchical relationship of the plu-
ral packets constituting the multiplexed data;
said apparatus recording the multiplexed data 
and the object table con-esponding to the multi- 
plexed data in said data storage. 

30. An object data recording apparatus having a data
storage for storing data, and recording, in said data 
storage, multiplexed data which includes N pieces 
of coded data (N = positive integer) and is parti- 
tioned into plural packets each having a prescribed 
coded quantity, which coded data are obtained by
compressively coding scene data constituting one 
scene, for each of N pieces of objects constituting
the scene, said apparatus including: 

hierarchical information extraction means for
extracting hierarchical information showing the 
hierarchical relationship of the respective 
objects constituting the scene, according to 
information showing the correlation of the 
respective coded data and included in the mul-
tiplexed data; and

table creation means for creating, according to 
the hierarchical information, an object table 
showing the hierarchical relationship of the plu-
ral packets constituting the multiplexed data;
said apparatus recording the multiplexed data 
and the object table corresponding to the multi- 
plexed data in said data storage. 